Connection reset error in an ECS cluster between two microservices running in two containers

0

Hi, we have two microservices running in separate containers in an ECS cluster. One microservice gets a "connection reset" error when it calls the other microservice over REST. We are using an Elastic Load Balancer, and we have verified that the idle timeout on the ELB is sufficiently large.

Has anyone faced this issue before? Any pointers would be helpful.

Thank you in advance. Vishwas

3 Answers
0

Hi,

Thank you for the quick response. I really appreciate it.

Communication between the microservices is standard REST-based HTTP. We see the connection reset error frequently.

Another observation is that it is more likely to happen when the service has been idle for some time.

We are using an Application Load Balancer. We are checking the ELB logs to see if we can find anything there.

Do you think setting the "Keep-Alive" header will help?

Thank you,

Vishwas

Vishwas
answered 2 years ago
  • Possibly. Try adding the Keep-Alive header, and also check the idle timeout settings on your backend service.

0

What does the communication pattern look like? Does the TCP connection linger with no packets for a long period of time, or do the connection resets happen very often?

Also, what type of ELB are you using?

For example, the NLB documentation states:

Elastic Load Balancing sets the idle timeout value for TCP flows to 350 seconds. You cannot modify this value. 

That said, connection resets should not be a big problem, especially if they don't occur often, because the client application can always open a new TCP connection in its error-handling logic. As a matter of fact, if the application is failing because of this, it indicates that the code does not handle error conditions gracefully.

As indicated in the documentation, TCP keep-alive is a viable option. Alternatively, you can have the application time out and close the connection itself before the load balancer closes it.
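As a rough illustration of that error-handling logic, here is a minimal Python sketch. The names `call_with_retry` and `request_fn` and the retry/backoff values are my own assumptions, not something from your code; `request_fn` stands for whatever function performs the REST call and is expected to open a fresh connection each time it runs.

```python
import time


def call_with_retry(request_fn, attempts=3, backoff_s=0.2):
    """Run request_fn(); if the connection was reset, try again.

    request_fn is a hypothetical callable that performs the REST call and
    opens a new connection on every invocation.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return request_fn()
        except (ConnectionResetError, BrokenPipeError) as err:
            last_error = err
            # Back off exponentially before the next try, but not after the last one.
            if attempt < attempts - 1:
                time.sleep(backoff_s * (2 ** attempt))
    # Every attempt failed: surface the last reset to the caller.
    raise last_error
```

Blind retries like this are only safe for idempotent requests (for example GET); a POST that may already have reached the other service needs more care.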
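For reference, this is roughly how TCP keep-alive can be switched on for a client socket in Python. It is only a sketch: the function name `enable_tcp_keepalive` and the default values are assumptions, chosen so that the first probe goes out well before the 350 seconds quoted above. The `TCP_KEEPIDLE`, `TCP_KEEPINTVL`, and `TCP_KEEPCNT` options are not available on every platform, so they are set only where the `socket` module exposes them.

```python
import socket


def enable_tcp_keepalive(sock, idle_s=60, interval_s=10, probes=5):
    """Turn on TCP keep-alive probes for an existing socket.

    idle_s, interval_s and probes are illustrative values, not recommendations.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket tuning options below exist on Linux but not everywhere.
    if hasattr(socket, "TCP_KEEPIDLE"):
        # Seconds of idleness before the first probe is sent.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    if hasattr(socket, "TCP_KEEPINTVL"):
        # Seconds between successive probes.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    if hasattr(socket, "TCP_KEEPCNT"):
        # Unanswered probes before the connection is considered dead.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return sock
```

Most HTTP client libraries manage their own sockets, so in practice you would look for the equivalent setting in the library or framework you use rather than calling this directly.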

Jason_S
answered 2 years ago
0

It should also be noted that the backend server's keep-alive timeout should be larger than the ELB's idle timeout. Otherwise the backend may close an idle connection that the load balancer still considers open.
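To make the comparison concrete, a small sanity check along these lines could run at startup or in a deployment script. The function name, the `margin_s` parameter, and the example numbers are all assumptions for illustration; plug in the values actually configured on your load balancer and backend server.

```python
def check_keepalive_config(backend_keepalive_s, elb_idle_timeout_s, margin_s=5):
    """Raise ValueError unless the backend keeps idle connections open
    longer than the load balancer does, by at least margin_s seconds."""
    if backend_keepalive_s < elb_idle_timeout_s + margin_s:
        raise ValueError(
            f"backend keep-alive ({backend_keepalive_s}s) should exceed the "
            f"ELB idle timeout ({elb_idle_timeout_s}s) by at least {margin_s}s"
        )
    return True


# Example with hypothetical values: backend keeps connections for 75s,
# load balancer idle timeout is 60s.
check_keepalive_config(backend_keepalive_s=75, elb_idle_timeout_s=60)
```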

Jason_S
answered 2 years ago
