Containers stop responding on public IP


Hi there!

I've got a few services (Prefect and MLflow) running in ECS. These have public IPs as well as private IPs with an associated service discovery endpoint.

Access over the public IP is unreliable (especially for MLflow, but both are affected). After a while, the services become unreachable on the public IP; however, they still respond to requests on the private IP or the service discovery endpoint.

Spawning a new container seems to fix this, and the services usually (not always) become reachable on their new public IPs right away, but they stop working again "after a while". At that point, the services stop responding to pings as well.

I think the containers themselves are probably fine, since they do respond to traffic that reaches them. I'm really at a loss here, and would be grateful for any input.

// R

  • Can you elaborate on the networking configuration? For example, which default gateway is configured? What are the SG and NACL rules? What error do you get when you are unable to connect to the public IP? (Please provide the curl -vI output.)

1 Answer
Accepted Answer

The issue was that containers were allowed to (re)spawn in any subnet in the VPC (I think it's random?).

Some of these subnets had configurations that were not suitable for our services: traffic could get in, but the services were not permitted to respond. I confirmed this by spawning a bunch of containers and checking which ones I could access.
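For anyone hitting the same thing, here is a rough sketch of how the subnets could be checked up front instead of by trial and error. It assumes the unsuitable subnets were ones whose route table has no default route to an internet gateway, which is only one possible cause (the NACL or SG rules asked about above could produce the same symptom). The function names and the example IDs are made up for illustration; the data shape follows what boto3's EC2 describe_route_tables call returns.

```python
def has_igw_default_route(route_table):
    """Return True if the table has an active 0.0.0.0/0 route to an internet gateway."""
    return any(
        route.get("DestinationCidrBlock") == "0.0.0.0/0"
        and route.get("GatewayId", "").startswith("igw-")
        and route.get("State", "active") == "active"
        for route in route_table.get("Routes", [])
    )


def subnets_with_internet_route(route_tables, subnet_ids):
    """Return the subnet IDs whose effective route table can reach an internet gateway.

    A subnet without an explicit association falls back to the VPC's main route table.
    """
    explicit = {}
    main_table = None
    for table in route_tables:
        for assoc in table.get("Associations", []):
            if assoc.get("Main"):
                main_table = table
            elif "SubnetId" in assoc:
                explicit[assoc["SubnetId"]] = table

    usable = []
    for subnet_id in subnet_ids:
        table = explicit.get(subnet_id, main_table)
        if table is not None and has_igw_default_route(table):
            usable.append(subnet_id)
    return usable


def fetch_route_tables(vpc_id):
    """Fetch all route tables of a VPC (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the helpers above work without boto3 installed

    ec2 = boto3.client("ec2")
    paginator = ec2.get_paginator("describe_route_tables")
    tables = []
    for page in paginator.paginate(Filters=[{"Name": "vpc-id", "Values": [vpc_id]}]):
        tables.extend(page["RouteTables"])
    return tables


if __name__ == "__main__":
    # Hypothetical data in the shape of a describe_route_tables response.
    sample_tables = [
        {   # main table: local traffic only, no way out to the internet
            "Associations": [{"Main": True}],
            "Routes": [{"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"}],
        },
        {   # explicitly associated table with a default route to an internet gateway
            "Associations": [{"Main": False, "SubnetId": "subnet-aaa"}],
            "Routes": [
                {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
                {"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-123"},
            ],
        },
    ]
    print(subnets_with_internet_route(sample_tables, ["subnet-aaa", "subnet-bbb"]))
```

With real data, the first argument would come from fetch_route_tables("vpc-...") using your own VPC ID. A subnet that passes this check can still be blocked by NACL or security group rules, so it narrows the list rather than proving a subnet is fine.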

The solution is to recreate the services with more carefully selected subnets.
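As a minimal sketch of what "carefully selected subnets" might look like in practice, assuming Fargate tasks in awsvpc network mode and boto3: the helper below builds the networkConfiguration value that the ECS create_service call accepts. The helper name and all IDs are placeholders, not values from our setup.

```python
def awsvpc_network_configuration(subnet_ids, security_group_ids):
    """Build an ECS networkConfiguration dict restricted to the given subnets."""
    if not subnet_ids:
        raise ValueError("at least one subnet is required")
    return {
        "awsvpcConfiguration": {
            # Only list subnets known to allow return traffic to the internet.
            "subnets": list(subnet_ids),
            "securityGroups": list(security_group_ids),
            # Tasks need a public IP to be reachable from outside the VPC.
            "assignPublicIp": "ENABLED",
        }
    }


if __name__ == "__main__":
    # Placeholder IDs; replace with subnets you have verified yourself.
    config = awsvpc_network_configuration(["subnet-aaa"], ["sg-123"])
    print(config)
```

The result would be passed as networkConfiguration=config when creating the service, so that new tasks can only land in the listed subnets.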

Richard
answered a month ago
