ECS service IP address not found as promised by R53/Service Discovery

The problem: running dig service1.myprivatenamespace from within the container returns no IP address.

Infrastructure
The VPC has two subnets, private and public. Each subnet is logically assigned to its own ECS cluster; both are in the same VPC.
In the public cluster there is an ECS proxy service; its task places an Nginx proxy container in the public subnet.
In the private cluster there are the db and web app services. Both tasks place their containers in the private subnet.
EC2 instance provider.
awsvpc network mode.
There is no load balancer in this setup.
The EC2 instances in the private and public subnets have security groups assigned that allow the traffic.
The containers also have the security group assigned to them.

The idea is to expose port 80 to the public via the proxy, which forwards requests to the web app container.

All three ECS services (proxy, db and web) are registered with Cloud Map:

db.qa-******-internal
webapp.qa-******-internal
proxy.qa-******-internal

These three ECS services are created in this order.

Background

  • When ECS creates the web app service, the task creates the web app container. The ECS web service stays in the CREATE_IN_PROGRESS state.
    This is because:
  • The web app container times out on creation, then restarts, and never enters a stable state. The service is not yet registered in Cloud Map.
    This is because:
  • The web app container fails to establish a database connection. A DB connection is required for the initial db migration when the web app starts for the first time. In the CloudWatch logs, I found this:
django.db.utils.OperationalError: (2005, "Unknown MySQL server host 'db.qa-*******-internal' (-2)")  

The DB host name is db.qa-******-internal, using Route 53 DNS service discovery. I expect this to be mapped to a private IP address. I can see the service instance with its private IPv4 address and port in Cloud Map.
db.qa-*******-internal is an ECS service registered with Cloud Map.

Namespace name qa-*****-internal
Namespace ID ns-7hh32r2fk4y5zdar
Service ID srv-hbckgvhgihrbrd24
DNS: A record and SRV, no health check, MULTIVALUE routing
IP 10.215.20.242
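
The registration seen in the console can also be cross-checked through the Cloud Map API, which does not depend on VPC DNS. This is a minimal sketch, assuming boto3 is installed and the caller has the servicediscovery:DiscoverInstances permission. The function names are illustrative, and the namespace and service names must be replaced with the real (masked) ones.

```python
def instance_ipv4s(discover_response):
    """Extract the registered IPv4 addresses from a DiscoverInstances response."""
    return [
        inst["Attributes"]["AWS_INSTANCE_IPV4"]
        for inst in discover_response.get("Instances", [])
        if "AWS_INSTANCE_IPV4" in inst.get("Attributes", {})
    ]


def discover(namespace_name, service_name):
    """Ask Cloud Map (over its API, not DNS) which IPs are registered."""
    import boto3  # third-party; imported lazily so the helper above stays dependency-free

    client = boto3.client("servicediscovery")
    response = client.discover_instances(
        NamespaceName=namespace_name,  # e.g. the qa-...-internal namespace
        ServiceName=service_name,      # e.g. "db"
    )
    return instance_ipv4s(response)
```

If this returns 10.215.20.242 while dig still answers NXDOMAIN, the registration itself is fine and the fault is on the DNS resolution side.
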
I SSHed into the EC2 instance where the db container lives. I can see there is no port mapping between the container and the host:

$ docker ps
CONTAINER ID        IMAGE                            COMMAND                  CREATED             STATUS                 PORTS               NAMES
ce74e20e997c        mysql:5.7                        "docker-entrypoint.s…"   4 hours ago         Up 4 hours                                 ecs-qa-******-db-10-mysql-98d1f28b9ea78fd55a00
8b428abf740d        amazon/amazon-ecs-pause:0.1.0    "./pause"                4 hours ago         Up 4 hours                                 ecs-qa-******-db-10-internalecspause-e8d59bf3f0aea3fcf701
035811b8986e        amazon/amazon-ecs-agent:latest   "/agent"                 4 hours ago         Up 4 hours (healthy)                       ecs-agent

The PortMappings are specified in the db task definition as:

      ContainerDefinitions:
        - Name: mysql
          Image: 'mysql:5.7'
          Essential: true
          Privileged: true
          Memory: !Ref Memory
          MemoryReservation: !Ref MemoryReservation
          Cpu: !Ref Cpu
          PortMappings:
            - ContainerPort: 3306
              HostPort: 3306

Is the db running?
The ECS task console says it is. I logged on to the EC2 instance, then got into the container, and finally logged in to the MySQL db:

  1. From within the container, yes, I can run mysql -u root -p to log in to MySQL, so it is running.
  2. From within the container, I tried to dig the hostname db.qa-****-internal:
apt-get update
apt install dnsutils
dig db.qa-******-internal
# dig db.qa-******-internal

; <<>> DiG 9.10.3-P4-Debian <<>> db.qa-*******-internal
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 27933
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;db.qa-******-internal.	IN	A

;; AUTHORITY SECTION:
.			984	IN	SOA	a.root-servers.net. nstld.verisign-grs.com. 2020022100 1800 900 604800 86400

;; Query time: 8 msec
;; SERVER: 10.215.0.2#53(10.215.0.2)
;; WHEN: Fri Feb 21 11:53:58 UTC 2020
;; MSG SIZE  rcvd: 119

So the name does not resolve to an IP address, even from within the db container itself.
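
The same lookup can be reproduced without installing dnsutils, for instance from the Django web container, where Python is already available. This is a minimal sketch using only the standard library. resolve_ipv4 and its resolver parameter are illustrative names, and port 3306 is simply the MySQL port from the task definition.

```python
import socket


def resolve_ipv4(hostname, port=3306, resolver=socket.getaddrinfo):
    """Return the sorted IPv4 addresses for hostname, or [] if the lookup fails."""
    try:
        infos = resolver(hostname, port, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        # Raised when the name cannot be resolved (for example on NXDOMAIN).
        return []
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})
```

Calling resolve_ipv4 with the db host name should return the private IP registered in Cloud Map. An empty list matches the NXDOMAIN answer from dig and the "Unknown MySQL server host" error seen by Django.
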

Why is it not returning the IP address?
Please help.

Asked 4 years ago, 1017 views
1 Answer

I found the problem.
In Route 53, check that the hosted zone is associated with the correct VPC.
Check that the following Amazon VPC settings are set to true:

  • enableDnsHostnames
  • enableDnsSupport

In this case, I had refactored the ServiceDiscovery PrivateDnsNamespace stack out of the main ECS stack at some point, but I forgot to update the VPC ID associated with it, so it still had an obsolete value.
(The Route 53 CloudFormation stack can't handle namespace re-association through a stack update, although you can create new ones with a duplicate name.)
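
The three checks above can be scripted. The following is only a sketch, assuming boto3 and read access to Cloud Map, Route 53 and EC2. The function names are made up for illustration, and task_vpc_id stands for the VPC the ECS tasks actually run in.

```python
def find_dns_problems(task_vpc_id, zone_vpc_ids, dns_support, dns_hostnames):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if task_vpc_id not in zone_vpc_ids:
        problems.append(
            f"hosted zone is associated with {zone_vpc_ids}, not with {task_vpc_id}"
        )
    if not dns_support:
        problems.append("enableDnsSupport is false")
    if not dns_hostnames:
        problems.append("enableDnsHostnames is false")
    return problems


def check_namespace(namespace_id, task_vpc_id):
    """Look up the namespace's hosted zone and the VPC DNS attributes, then run the checks."""
    import boto3  # third-party; imported lazily so find_dns_problems has no dependencies

    sd = boto3.client("servicediscovery")
    r53 = boto3.client("route53")
    ec2 = boto3.client("ec2")

    # A private DNS namespace is backed by a Route 53 private hosted zone.
    namespace = sd.get_namespace(Id=namespace_id)["Namespace"]
    zone_id = namespace["Properties"]["DnsProperties"]["HostedZoneId"]
    zone = r53.get_hosted_zone(Id=zone_id)
    zone_vpc_ids = [vpc["VPCId"] for vpc in zone.get("VPCs", [])]

    support = ec2.describe_vpc_attribute(
        VpcId=task_vpc_id, Attribute="enableDnsSupport"
    )["EnableDnsSupport"]["Value"]
    hostnames = ec2.describe_vpc_attribute(
        VpcId=task_vpc_id, Attribute="enableDnsHostnames"
    )["EnableDnsHostnames"]["Value"]

    return find_dns_problems(task_vpc_id, zone_vpc_ids, support, hostnames)
```

For the namespace in the question this would be called as check_namespace("ns-7hh32r2fk4y5zdar", "vpc-0123456789abcdef0"), where the VPC ID is a placeholder. In my case the first check would have failed, because the hosted zone was still associated with the old VPC.
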

Answered 4 years ago
