Fargate ECS task pulling images via endpoints in a private subnet has strange network logs.


This is the result of an experiment I ran on my Fargate task with VPC flow logs, to capture how it pulls the image from ECR and how I can tighten my security further via NACLs and security groups! But I was amazed to see that my VPC endpoints are pinging my task?? I was expecting my task to be the one making requests to the VPC endpoints!

All the ingress traffic is made by the interface VPC endpoints (the request maker), and my task is sending the response in reply! But for the gateway endpoint, requests are being generated from the task as expected, going to the S3 public IPs via my router!

What's the reason for the endpoints pinging my task instead of the task pinging my endpoints?

And I always thought it's the kernel that auto-assigns the source port! But here in the logs it seems the destination has a random port whereas the source port is 443! Isn't it strange?

I am very much a noob in AWS, but I always try to learn by researching in more depth! I am not a network specialist, so sorry in advance if I asked something stupid!

fargate task ENI = eni-0316aecb1fb81b78a = 172.31.54.28

ecr interface endpoint ENI = 172.31.53.75

docker interface endpoint ENI = 172.31.68.156

public ip of s3 gateways: 16.12.5.114 , 16.12.5.50

srcPort srcIp destIP destPort eniId subnetId traffictype

443 172.31.53.75 172.31.54.28 36636 eni-0316aecb1fb81b78a subnet-003baff457020174e ingress

36636 172.31.54.28 172.31.53.75 443 eni-0316aecb1fb81b78a subnet-003baff457020174e egress

38126 172.31.54.28 16.12.5.114 443 eni-0316aecb1fb81b78a subnet-003baff457020174e egress

443 16.12.5.114 172.31.54.28 38126 eni-0316aecb1fb81b78a subnet-003baff457020174e ingress

443 172.31.53.75 172.31.54.28 36662 eni-0316aecb1fb81b78a subnet-003baff457020174e ingress

33294 172.31.54.28 172.31.68.156 443 eni-0316aecb1fb81b78a subnet-003baff457020174e egress

59166 172.31.54.28 16.12.5.50 443 eni-0316aecb1fb81b78a subnet-003baff457020174e egress

36662 172.31.54.28 172.31.53.75 443 eni-0316aecb1fb81b78a subnet-003baff457020174e egress

443 172.31.53.75 172.31.54.28 36646 eni-0316aecb1fb81b78a subnet-003baff457020174e ingress

36646 172.31.54.28 172.31.53.75 443 eni-0316aecb1fb81b78a subnet-003baff457020174e egress

33282 172.31.54.28 172.31.68.156 443 eni-0316aecb1fb81b78a subnet-003baff457020174e egress

443 172.31.53.75 172.31.54.28 36620 eni-0316aecb1fb81b78a subnet-003baff457020174e ingress

33272 172.31.54.28 172.31.68.156 443 eni-0316aecb1fb81b78a subnet-003baff457020174e egress

36620 172.31.54.28 172.31.53.75 443 eni-0316aecb1fb81b78a subnet-003baff457020174e egress

443 172.31.68.156 172.31.54.28 33272 eni-0316aecb1fb81b78a subnet-003baff457020174e ingress

443 172.31.68.156 172.31.54.28 33282 eni-0316aecb1fb81b78a subnet-003baff457020174e ingress

443 16.12.5.50 172.31.54.28 59166 eni-0316aecb1fb81b78a subnet-003baff457020174e ingress

42636 172.31.54.28 172.31.51.28 443 eni-0316aecb1fb81b78a subnet-003baff457020174e egress

443 172.31.68.156 172.31.54.28 33294 eni-0316aecb1fb81b78a subnet-003baff457020174e ingress

443 172.31.51.28 172.31.54.28 42636 eni-0316aecb1fb81b78a subnet-003baff457020174e ingress

2 Answers
Accepted Answer

I ran out of characters in my comment, so I'm putting this here in an answer.

That's great to hear, Rahat. NACLs are kind of a black art and take more effort to get your head around. So here goes. I will use your VPC flow log as a reference, as it makes it easier to explain. NACLs, as you say, are stateless, so the return traffic isn't automatically allowed.

If you remember, your VPC flow log had 2 entries for 1 TCP communication: the ECS task talking to the VPC endpoint, and then the VPC endpoint talking back to the ECS task. Remember, NACLs are only applied at the subnet level. If you have 2 resources in the same subnet talking to each other, the NACL will never be applied. It's only applied when traffic leaves or comes into the subnet.
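
To make the "only at the subnet boundary" point concrete, here is a minimal Python sketch that checks whether two addresses from the flow log sit in different subnets. The subnet CIDRs are an assumption (two /20 blocks picked for illustration); the thread never states the real ones, so substitute your own.

```python
import ipaddress

# Hypothetical subnet CIDRs; the thread never states the real ones.
SUBNETS = ["172.31.48.0/20", "172.31.64.0/20"]

def nacl_applies(src_ip, dst_ip, subnet_cidrs=SUBNETS):
    """Return True when traffic between the two IPs crosses a subnet boundary."""
    nets = [ipaddress.ip_network(c) for c in subnet_cidrs]

    def subnet_of(ip):
        addr = ipaddress.ip_address(ip)
        # First listed subnet containing the address, or None if outside all of them.
        return next((n for n in nets if addr in n), None)

    src_net, dst_net = subnet_of(src_ip), subnet_of(dst_ip)
    # Traffic to or from an address outside every listed subnet always crosses a boundary.
    return src_net is None or dst_net is None or src_net != dst_net
```

Under those assumed CIDRs, `nacl_applies("172.31.54.28", "172.31.53.75")` is `False` (same subnet, so the NACL is never consulted), while the task talking to `172.31.68.156` or to the S3 address `16.12.5.114` would be subject to the NACL.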

Your NACLs have to be configured to allow that exact conversation flow. For this example, let's concentrate on the VPC endpoints being in their own subnet, and we are setting up a NACL on the VPC endpoint subnet as follows.

INBOUND :- You would need to allow the ECS IP address/CIDR to the destination port of 443, which is similar to a security group, and the source would be the ECS task subnet/CIDR... That's the easy part.

OUTBOUND :- Gets a little more complicated. You need to allow the return traffic to go back to the source ports the traffic came from. The outbound rule needs to allow traffic TO the ephemeral ports, which you need to look up for each OS, for the traffic going back to the host(s). If you remember, your VPC flow log above had an ephemeral port of 36620. This changes on each new TCP conversation and is drawn from a range. Your outbound NACL rule needs to allow traffic back on the ephemeral range to where the traffic came from. For Linux this is generally ports 32768-60999, and your destination CIDR range would be the subnet where your ECS tasks are.

ECS SUBNETS :- Again, you would need to set up NACLs on the subnet where your ECS task is, which would be the opposite of the VPC endpoint subnet. OUTBOUND would be port 443, and the destination would be the VPC endpoint subnet CIDR. INBOUND would be the ephemeral range, and the source would be the VPC endpoint CIDR.
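
The four rules above can be sketched as a small Python simulation of stateless NACL evaluation. This is only a model, not an AWS API: `NaclRule`, `nacl_allows` and `round_trip_allowed` are hypothetical names, the two CIDRs are assumed for illustration, and the rules are matched lowest number first with an implicit deny at the end.

```python
import ipaddress
from dataclasses import dataclass

@dataclass(frozen=True)
class NaclRule:
    number: int        # lower numbers are evaluated first
    remote_cidr: str   # source CIDR for inbound rules, destination CIDR for outbound
    port_from: int     # destination port range of the packet
    port_to: int
    allow: bool = True

def nacl_allows(rules, remote_ip, dst_port):
    """First matching rule wins; no match means the implicit deny applies."""
    addr = ipaddress.ip_address(remote_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        in_cidr = addr in ipaddress.ip_network(rule.remote_cidr)
        if in_cidr and rule.port_from <= dst_port <= rule.port_to:
            return rule.allow
    return False

# Assumed CIDRs for illustration only; the thread does not give them.
ECS_CIDR = "172.31.48.0/20"
ENDPOINT_CIDR = "172.31.64.0/20"
EPHEMERAL = (32768, 60999)  # typical Linux range, as mentioned above

endpoint_inbound = [NaclRule(100, ECS_CIDR, 443, 443)]
endpoint_outbound = [NaclRule(100, ECS_CIDR, *EPHEMERAL)]
ecs_outbound = [NaclRule(100, ENDPOINT_CIDR, 443, 443)]
ecs_inbound = [NaclRule(100, ENDPOINT_CIDR, *EPHEMERAL)]

def round_trip_allowed(task_ip, task_port, endpoint_ip, ecs_in=ecs_inbound):
    """Check the request and the reply separately, since NACLs keep no state."""
    request = (nacl_allows(ecs_outbound, endpoint_ip, 443)
               and nacl_allows(endpoint_inbound, task_ip, 443))
    reply = (nacl_allows(endpoint_outbound, task_ip, task_port)
             and nacl_allows(ecs_in, endpoint_ip, task_port))
    return request and reply
```

With the flow-log pair `172.31.54.28:33272` and `172.31.68.156:443`, `round_trip_allowed("172.31.54.28", 33272, "172.31.68.156")` passes. Swapping the ECS inbound rule for one that only allows port 443 makes the reply fail, because the reply's destination port is the ephemeral port, not 443.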

Security groups are stateful and ALLOW traffic only. Security groups are the lowest level of control, at the resource level. NACLs are stateless, and you can ALLOW and BLOCK traffic at a port and IP level. These are applied at the subnet level.
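
For contrast with the stateless NACL, here is a rough Python sketch of what "stateful" means for a security group. It is a deliberately simplified model (ports only, no CIDRs, no timeouts), and `StatefulFirewall` is a hypothetical class, not how AWS implements connection tracking.

```python
class StatefulFirewall:
    """Toy model: allow rules per direction plus a table of tracked connections."""

    def __init__(self, inbound_ports=(), outbound_ports=()):
        self.inbound_ports = set(inbound_ports)    # local ports others may connect to
        self.outbound_ports = set(outbound_ports)  # remote ports we may connect to
        self.tracked = set()                       # (local_port, remote_ip, remote_port)

    def send(self, local_port, remote_ip, remote_port):
        """Outbound packet: allowed if it belongs to a tracked flow or matches a rule."""
        conn = (local_port, remote_ip, remote_port)
        if conn in self.tracked:
            return True
        if remote_port in self.outbound_ports:
            self.tracked.add(conn)
            return True
        return False

    def receive(self, remote_ip, remote_port, local_port):
        """Inbound packet: allowed if it belongs to a tracked flow or matches a rule."""
        conn = (local_port, remote_ip, remote_port)
        if conn in self.tracked:
            return True
        if local_port in self.inbound_ports:
            self.tracked.add(conn)
            return True
        return False

# Task side: outbound 443 only. Endpoint side: inbound 443 only.
task_sg = StatefulFirewall(outbound_ports=[443])
endpoint_sg = StatefulFirewall(inbound_ports=[443])
```

In this model the task's request `task_sg.send(36620, "172.31.53.75", 443)` is allowed by the outbound rule, and the reply `task_sg.receive("172.31.53.75", 443, 36620)` is then allowed without any inbound rule because the flow is already tracked. The endpoint side mirrors this with a single inbound rule for 443.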

If I have helped you understand and clarified the differences, I would appreciate it if you accepted one of my answers. Please comment with any questions.

Here are some links which may be useful too.

NACLs: https://docs.aws.amazon.com/vpc/latest/userguide/vpc-network-acls.html#custom-network-acl

Ephemeral Port: https://en.wikipedia.org/wiki/Ephemeral_port

EXPERT
answered 6 months ago
EXPERT
reviewed 2 months ago
  • Thank you so much, Gary, for all your replies and the effort to help me understand this confusing stuff! I really want to express my gratitude for all the help! Thank you!

  • You're most welcome, Rahat. Thank you for being patient. Hopefully I've helped clear up some of the intricacies of AWS. Come back anytime for anything else.


This isn’t a ping but TCP traffic.

The return traffic will always be from Port 443 to the source random port on the ECS task.

It's not about who started the call; it's just the natural TCP traffic flow. This is normal for TCP-initiated connections.

You should always see the TCP conversation start from the ECS task to port 443 on the endpoint, and the return traffic will be from the endpoint on port 443.

As you can see from this diagram, imagine ECS is on the left and the VPC endpoint is on the right (port 80 in this example). You will see the same behaviour.

https://packetlife.net/media/blog/attachments/429/tcp_flow.png

You could enable custom flow logs to see the direction: https://docs.aws.amazon.com/vpc/latest/userguide/flow-logs-records-examples.html#flow-log-example-traffic-path
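
As a rough illustration of reading the flow log this way, the Python sketch below guesses who initiated a conversation from a single record: the side using port 443 is treated as the server, and the side using the random high port as the client. It assumes the column order from the question (`srcPort srcIp destIP destPort ...`) and is only a heuristic; `initiator` is a hypothetical helper, and the order of the records in the log says nothing about who spoke first.

```python
SERVER_PORT = 443  # assumption: every conversation in the log is HTTPS to an endpoint

def initiator(record):
    """Return the IP that most likely opened the connection for one flow-log line."""
    src_port, src_ip, dst_ip, dst_port, *_ = record.split()
    if int(dst_port) == SERVER_PORT:
        return src_ip   # request direction: client -> server:443
    if int(src_port) == SERVER_PORT:
        return dst_ip   # reply direction: server:443 -> client
    raise ValueError("neither side uses the assumed server port")

ingress = "443 172.31.53.75 172.31.54.28 36620 eni-0316aecb1fb81b78a subnet-003baff457020174e ingress"
egress = "36620 172.31.54.28 172.31.53.75 443 eni-0316aecb1fb81b78a subnet-003baff457020174e egress"
```

Both `initiator(ingress)` and `initiator(egress)` give `172.31.54.28`, the Fargate task, even though the ingress line appears first in the log.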

EXPERT
answered 6 months ago
AWS EXPERT
reviewed 6 months ago
  • If that's the case, then in my task security group, if I only allow outbound traffic to port 443 and hope incoming traffic on port 443 will be auto-allowed, it doesn't work for some reason. Therefore, I have to explicitly allow incoming traffic on port 443 in my inbound rules. This seems to contradict the knowledge of stateful firewalls!

    Similarly, I have another rule that allows outbound traffic on port 443 to the prefix list of S3 public IP addresses, and it works automatically without explicitly setting inbound rules!

    Other than that, my endpoint security group is allowing HTTPS traffic in and out for the VPC CIDR block, as stateful firewalls are only stateful for outbound traffic. For any unknown inbound connection, I have to explicitly give it in and out access. (But it should be something like all source ports instead of 443, because for the endpoint it's incoming traffic, and the outbound side should also allow all destination ports, but that doesn't work.)

    Similarly, my NACL is allowing all TCP traffic for the VPC CIDR and allowing outbound traffic to the prefix list CIDRs of the S3 gateway on port 443. It also allows incoming traffic on all TCP ports (allowing only port 443 doesn't work) for responses coming back from the S3 gateway endpoint, as NACLs are stateless!

    Help me solve the mysteries of why the NACL and security group are not behaving as expected!

  • I’ve a feeling you’re getting mixed up with security groups on the VPC endpoint and the ECS tasks.

    The ECS security group needs to allow outbound on port 443. The VPC endpoint needs to allow inbound on port 443.

    Security groups are stateful and you don’t need to allow return traffic.

    Stateful firewalls are stateful for both inbound and outbound! There are no security groups on the S3 gateway.

  • Now, are you using the same security group on the VPC endpoint and the ECS task? If so, it makes sense. Each security group is like a resource firewall. Just because a security group is shared doesn't mean the resources using it automatically allow traffic from each other. You still have to have a rule for each resource/direction.

    If you have a separate security group for the endpoint and the ECS task, it will make more sense to you.

  • Thanks for your replies!

    The thing I learned was that a request has 2 parts, source and destination! Request from the task: src: randomPort, dest: 443. From the task's perspective: outbound for 443 should be allowed (as it expects the destination from us). From the endpoint's perspective: inbound for randomPort should be allowed (as it expects the source it is coming from). Then how does allowing inbound on 443 make sense to you from the endpoint's perspective? Is my knowledge wrong??

    Besides that, I have tried exactly what you said! It gives me this error: ResourceInitializationError: unable to pull secrets or registry auth: execution resource retrieval failed: unable to retrieve ecr registry auth: service call has been retried 3 time(s): RequestError: send request failed caused by: Post "https://api.ecr.ca-central-1.amazonaws.com/": dial tcp 172.31.65.77:443: i/o timeout. Please check your task network configuration.

    The only configuration that works for me is:
    task-secure: allows HTTPS 443 for the VPC CIDR (both inbound and outbound); used for the task and the endpoint.
    s3-guard: allows HTTPS 443 to the S3 prefix list (outbound only); used for the task.
    private-nacl: allows all TCP for the VPC CIDR (inbound and outbound); allows the S3 prefix list CIDRs on port 443 (outbound); allows the S3 prefix list CIDRs on all ephemeral ports (inbound).

    Though I am a noob, just to let you know, I have been at this since yesterday; I invested almost 12+ hours just to understand what's actually happening! If you don't believe me, try it, it won't take more than 5 minutes!

  • You're welcome. Yes, your understanding may not match exactly how AWS works. I'm trying my best to explain. In the inbound rule you don't allow the source port; you are still allowing access to the port you're targeting, i.e. 443. So the inbound rule on the endpoint will be port 443, and the source IP or security group will be the ECS task or the VPC CIDR range. Please use a different security group on the endpoint, with only an inbound rule for port 443, and set the source to the CIDR range of your VPC. It would be easier to show you in person how it works, as text can be misinterpreted.
