Connection reset error in ECS cluster between two microservices running on two containers.
Hi, we have two microservices running in separate containers in an ECS cluster. One microservice gets a "connection reset" error when it tries to call the other over REST. We are using an Elastic Load Balancer, and we verified that the idle timeout on the ELB is sufficiently large.
Has anyone faced this issue before? Any pointers would be helpful.
Thank you in advance. Vishwas
Thank you for the quick response. Really appreciate it.
Communication between the microservices is standard REST-based HTTP. We see the connection reset error frequently.
Another observation is that it is more likely to happen when the service has been idle for some time.
We are using an Application Load Balancer. We are checking the ELB logs to see if we find anything there.
Do you think setting the "Keep-Alive" header will help?
Maybe. You should try adding the Keep-Alive header. Also check the idle timeout settings on your backend service.
What does the communication pattern look like? Does the TCP connection linger with no packets for a long period of time, or do the connection resets happen very often?
Also, what type of ELB are you using?
For example, the NLB documentation states: "Elastic Load Balancing sets the idle timeout value for TCP flows to 350 seconds. You cannot modify this value."
That said, connection resets should not be a big problem, especially if they don't occur often, because the client application can always open a new TCP connection in its error handling logic. As a matter of fact, if the application is failing because of this, it indicates that the code does not handle error conditions gracefully.
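To illustrate the error handling idea, here is a minimal Python sketch of a retry wrapper. The name `call_with_retry` and its parameters are made up for this example, and it assumes the reset surfaces as the built-in `ConnectionResetError`. HTTP client libraries often wrap it in their own exception type, so the `except` clause would need adjusting, and retrying is only safe for idempotent requests.

```python
import time


def call_with_retry(request, attempts=3, backoff_seconds=0.5, sleep=time.sleep):
    """Call request(); if the connection is reset, wait and try again.

    `request` is any zero-argument callable that performs the call and
    opens a fresh connection when the previous one is gone.
    """
    for attempt in range(1, attempts + 1):
        try:
            return request()
        except ConnectionResetError:
            if attempt == attempts:
                # Out of attempts: let the caller see the error.
                raise
            # Simple linear backoff before the next attempt.
            sleep(backoff_seconds * attempt)
```

You would pass the function that makes the REST call, for example `call_with_retry(lambda: fetch_orders())`, where `fetch_orders` stands in for your own client code.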
As indicated in the documentation, TCP keep-alive is a viable option. Alternatively, you can have the application time out idle connections itself before the load balancer closes them.
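The second option could look roughly like the sketch below: the client remembers when it last used a connection and reconnects if it has been idle longer than a threshold set below the load balancer's idle timeout. `IdleAwareConnection`, `connect`, and `max_idle_seconds` are hypothetical names for this example, and `connect` is assumed to return an object with a `close()` method.

```python
import time


class IdleAwareConnection:
    """Hands out a connection, replacing it once it has sat idle too long."""

    def __init__(self, connect, max_idle_seconds, clock=time.monotonic):
        self._connect = connect            # zero-argument factory for a new connection
        self._max_idle = max_idle_seconds  # keep this below the load balancer idle timeout
        self._clock = clock
        self._conn = None
        self._last_used = None

    def get(self):
        now = self._clock()
        stale = self._conn is not None and now - self._last_used >= self._max_idle
        if stale:
            # Close it ourselves before the load balancer drops it silently.
            self._conn.close()
            self._conn = None
        if self._conn is None:
            self._conn = self._connect()
        self._last_used = now
        return self._conn
```

Many HTTP client libraries have their own connection pool settings for this, so check those before rolling your own.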
It should also be noted that the backend server's keep-alive timeout should be larger than the ELB's idle timeout.
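For the TCP keep-alive option mentioned above, here is a rough Python sketch of enabling it on a client socket. The helper name `enable_tcp_keepalive` and the default values are assumptions for illustration. The `TCP_KEEPIDLE`, `TCP_KEEPINTVL`, and `TCP_KEEPCNT` options are not available on every platform, so the code checks for them first. The idle value should be well below the load balancer's idle timeout (350 seconds in the NLB case quoted above).

```python
import socket


def enable_tcp_keepalive(sock, idle_seconds=60, interval_seconds=10, probes=5):
    """Turn on TCP keep-alive probes for an existing socket."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The tuning options below are platform dependent, so set them only if present.
    if hasattr(socket, "TCP_KEEPIDLE"):
        # Seconds of idleness before the first probe is sent.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
    if hasattr(socket, "TCP_KEEPINTVL"):
        # Seconds between probes.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_seconds)
    if hasattr(socket, "TCP_KEEPCNT"):
        # Unanswered probes before the connection is considered dead.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return sock
```

How you get at the underlying socket depends on your HTTP client, and whether the probes keep the connection open depends on the type of load balancer in the path.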