When to expose a new microservice?
A microservice has to expose well-defined functionality backed by a significant amount of code. It is difficult to define what counts as significant code, so use the following criteria when deciding whether to create a new microservice.
- An existing service is turning into a monolith: various kinds of functionality are frequently being added to it
- Performance is becoming a major bottleneck because some APIs are hit much harder than others, and this cannot be addressed within the existing microservice
- One part of the service changes too frequently, creating a risk of breaking existing functionality
- HA (high availability) requirements are drastically different for different parts of the service
- It makes business sense to split modules into different services
- A specific technology needs to be used to solve a specific problem
- Investment is justified and available
Microservices come with a lot of overhead, so thorough brainstorming is mandatory before adding a new service to the system.
Provisioning of Microservice
Before a service goes to production, the manager has to open an infrastructure provisioning ticket 2 days in advance, following the guidelines below for each resource type.
Performance Measurement Guidelines
Sizing of machine/docker containers
Before you deploy your application, you must understand the performance characteristics of your service. Every service serves a different purpose, so their performance requirements will differ.
- Some services deal with a lot of data, so their memory requirements will be different
- Some services, like a log collector, read a lot of data and transport it to a separate endpoint
- Some services get a lot of requests, so responding in time is critical
- Some services depend heavily on a database, so their performance characteristics change with the type of database
There are some parameters to think about.
The first step is to identify the most critical APIs.
Based on Cars24 load characteristics, make sure you have run a load test with these numbers:
1. 20 requests / second for the most frequently used APIs
2. Per-API SLA: response time should not exceed 100 ms
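The two load test targets above can be checked against recorded latencies with a small script. This is a minimal sketch, not an existing tool: the function names are illustrative, and using the 95th percentile as the SLA measure is an assumption, since the guideline does not say which percentile the 100 ms limit applies to.

```python
TARGET_RPS = 20   # from the guideline: 20 requests / second
SLA_MS = 100      # from the guideline: 100 ms per API


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list; pct is an integer 1..100."""
    ordered = sorted(samples)
    # Integer ceiling of pct/100 * n, avoiding floating point rounding.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def evaluate_load_test(latencies_ms, duration_s):
    """Summarise one load test run for a single API.

    latencies_ms: one latency sample per completed request, in milliseconds.
    duration_s:   wall-clock length of the run, in seconds.
    """
    achieved_rps = len(latencies_ms) / duration_s
    p95_ms = percentile(latencies_ms, 95)  # assumption: SLA judged at p95
    return {
        "achieved_rps": achieved_rps,
        "p95_ms": p95_ms,
        "max_ms": max(latencies_ms),
        "meets_rps": achieved_rps >= TARGET_RPS,
        "meets_sla": p95_ms <= SLA_MS,
    }
```

Run it once per critical API with the latencies exported from your load testing tool. If the SLA should hold for every request rather than for 95% of them, compare `max_ms` against the limit instead.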
CPU – Is your service CPU intensive?
Memory – How much data will be held in memory at any given point in time?
Networking – Does your service require special networking capabilities? This usually applies to applications that deal with extensive data ingestion, like 100 MBps.
IOPS – Most applications are IO bound, meaning that database operations, rather than CPU, consume most of the time. Observe your application's behavior in production to understand where the bottleneck is. Often the application's CPU spikes because of the underlying database, and the real problem is not in the application code base. In this case caching must be considered as an alternative, using either Redis or Elasticsearch.
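The caching suggestion above is commonly implemented as the cache-aside pattern: read from the cache first and go to the database only on a miss. The sketch below is illustrative only. `TTLCache` is a hypothetical in-memory stand-in for Redis, and `load_from_db` stands for whatever query your service already runs; neither comes from an existing library.

```python
import time


class TTLCache:
    """Hypothetical in-memory stand-in for a cache such as Redis."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, expiry time)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._entries[key]
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (value, self._clock() + self._ttl)


def get_with_cache(cache, key, load_from_db):
    """Cache-aside read: return the cached value, else load and cache it.

    Note: a None result from the database is treated as a miss and is
    loaded again on the next call.
    """
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_from_db(key)
    cache.set(key, value)
    return value
```

The TTL bounds how stale a cached value can get, so choose it based on how quickly the underlying data changes.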
Questions you should ask from first principles while provisioning an EC2 machine
EC2 (Fargate Cluster)
- How many requests per second are you expecting?
- Does this require high availability?
- Does it expose an HTTP API?
- Can it be shut down at night?
- Can it be scaled down at night? If yes, to what type? If no, why not?
- Is a hard disk required?
- Is it temporary? If yes, please specify the date on which it will be stopped
- Just like EC2 machines, Docker containers have to be sized when deploying applications
- RAM, CPU, number of tasks and storage are the main factors
- Each Fargate container is deployed as an ECS “Service”. Service deployment helps with service discovery using Route53 and is completely automated
- CPU is defined in CPU units (1024 units = 1 vCPU) and RAM in MiB (1024 MiB = 1 GB)
- If you require 2 cores, define CPU as 2048; RAM will then be a minimum of 4 × 1024 = 4096 (4 GB)
- Fargate capacity is available only in slabs. For 1 CPU (1024), memory comes with at least 2048 (2 GB), so be careful that you ask for the right CPU size
- Nature of Service (HTTP/TCP)
- Health Check has to be defined
- Health Check interval has to be defined
- Is your application in the POC phase? If yes, can it share a database server with other applications?
- What is the default connection pool size of your application? If it is a POC, can it be set to 2? Connections consume resources on the application side, and the database has to use a lot of CPU/memory to maintain them
- Limit the connection pool size to 10, assuming there will be 2 instances running. If a request can complete within 100 ms:
- 1 connection = 10 requests per second
- 10 connection = 10×10 = 100 requests per second per server
- 2 servers = 100×2 = 200 requests per second.
- How many do we need? 20 requests per second?
- Will the database be deleted after a few days? If yes, define a date
- How many IOPS will be required?
- How much disk space will be required? Provide the estimation logic
- By default we use MySQL to store all microservice data. The MySQL version has to be 5.7
- Redis is an extremely fast and efficient caching store and provides a lot of functionality
- You must state what kind of operations you are going to run on Redis, because the operations determine whether the Redis instance requires more CPU
- Redis throughput is measured in thousands of requests per second, so the application should be able to handle heavy load with a small Redis instance
- Logs put tremendous stress on systems. They consume a lot of IOPS and degrade performance when used extensively. Log only where it is useful
- Use proper context in logs by capturing useful information like userid/lead-id, but not personal information like phone numbers
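The connection pool arithmetic in the list above (10 connections per instance, 2 instances, 100 ms per request) can be written as a small calculation. This is a rough sketch, assuming each request holds exactly one database connection for its whole duration. The function names are illustrative and not part of any existing tool.

```python
import math


def requests_per_second(pool_size, request_ms, instances):
    """Upper bound on throughput for a given pool size.

    Assumes every request holds one connection for request_ms milliseconds.
    """
    per_connection = 1000.0 / request_ms  # requests one connection serves per second
    return pool_size * per_connection * instances


def connections_needed(target_rps, request_ms, instances):
    """Smallest pool size per instance that can sustain target_rps."""
    per_connection = 1000.0 / request_ms
    return math.ceil(target_rps / (per_connection * instances))


if __name__ == "__main__":
    # Numbers from the guideline: pool of 10, 100 ms requests, 2 instances.
    print(requests_per_second(10, 100, 2))
    # Pool size per instance needed for the 20 requests / second target.
    print(connections_needed(20, 100, 2))
```

Under these assumptions a pool of 10 on 2 instances supports 200 requests per second, while the 20 requests per second target needs only 1 connection per instance. That is why a small pool is usually enough. Leave headroom for slow queries, which hold connections longer than 100 ms.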