The problem: dig service1.myprivatenamespace, run from within the container, returns no IP address.
Infrastructure
The VPC has two subnets, private and public. Each subnet is logically assigned to its own ECS cluster; both clusters are in the same VPC.
In the public cluster there is an ECS proxy service; its task places an Nginx proxy container in the public subnet.
In the private cluster there are db and web app services. Both tasks place their containers in the private subnet.
EC2 instance provider.
awsvpc network mode.
There is no load balancer in this.
The EC2 instances in the private and public subnets have a security group assigned that allows the traffic.
The containers also have the security group assigned to them.
The idea is to expose port 80 to the public via the proxy, which forwards requests to the web app container.
All three ECS services (proxy, db and web) are registered in Cloud Map:
db.qa-******-internal
webapp.qa-******-internal
proxy.qa-******-internal
These three ECS services are created in this order.
Background
- When ECS creates the web app service, the task creates the web app container. The ECS web service stays in the CREATE_IN_PROGRESS state.
This is because:
- The web app container times out on creation, then restarts, and never reaches a stable state. The service is not yet registered in Cloud Map.
This is because:
- The web app container fails to establish a database connection. A DB connection is required for the initial db migration when the web app starts for the first time. In the CloudWatch logs, I found this:
django.db.utils.OperationalError: (2005, "Unknown MySQL server host 'db.qa-*******-internal' (-2)")
The DB host name is db.qa-******-internal, using Route 53 DNS service discovery. I expect this to map to a private IP address. I can see the service instance with its private IPv4 address and port in Cloud Map.
db.qa-*******-internal is an ECS service registered in Cloud Map:
Namespace name qa-*****-internal
Namespace ID ns-7hh32r2fk4y5zdar
Service ID srv-hbckgvhgihrbrd24
DNS: A record and SRV, no health check, MULTIVALUE routing
IP 10.215.20.242
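The failing lookup can be reproduced outside Django with a plain resolver call. The following is a minimal Python sketch, not the web app's actual code: `resolve_ipv4` is a hypothetical helper name, and the default port is the MySQL port from the task definition. The `(-2)` in the MySQL error is probably the resolver's "name not known" code on Linux, but that reading is an assumption based on the error text alone.

```python
import socket


def resolve_ipv4(host, port=3306):
    """Return the sorted IPv4 addresses the system resolver gives for host.

    An empty list means the name could not be resolved (socket.gaierror).
    """
    try:
        infos = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, (ip, port)).
    return sorted({info[4][0] for info in infos})
```

Run inside the web app task with the real namespace name, `resolve_ipv4("db.qa-******-internal")` would be expected to return an empty list while the problem persists, and the Cloud Map IP once the name resolves.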
I SSHed into the EC2 instance where the db container lives. I can see there is no port mapping between the container and the host:
$ docker ps
CONTAINER ID   IMAGE                            COMMAND                  CREATED       STATUS                 PORTS   NAMES
ce74e20e997c   mysql:5.7                        "docker-entrypoint.s…"   4 hours ago   Up 4 hours                     ecs-qa-******-db-10-mysql-98d1f28b9ea78fd55a00
8b428abf740d   amazon/amazon-ecs-pause:0.1.0    "./pause"                4 hours ago   Up 4 hours                     ecs-qa-******-db-10-internalecspause-e8d59bf3f0aea3fcf701
035811b8986e   amazon/amazon-ecs-agent:latest   "/agent"                 4 hours ago   Up 4 hours (healthy)           ecs-agent
The PortMappings are specified in the db task definition as:
ContainerDefinitions:
  - Name: mysql
    Image: 'mysql:5.7'
    Essential: true
    Privileged: true
    Memory: !Ref Memory
    MemoryReservation: !Ref MemoryReservation
    Cpu: !Ref Cpu
    PortMappings:
      - ContainerPort: 3306
        HostPort: 3306
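As far as I know, with awsvpc network mode each task gets its own network interface, so an empty PORTS column in docker ps is expected and does not by itself mean port 3306 is unreachable. One way to separate a DNS problem from a connectivity problem is to test the TCP port against the IP address directly. The following is a minimal Python sketch under that assumption; `can_connect` is a hypothetical helper, and 10.215.20.242 with port 3306 is the address Cloud Map shows for the db instance.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable names.
        return False
```

If `can_connect("10.215.20.242", 3306)` succeeds from the web app's subnet while the hostname still fails to resolve, the security groups and port mapping are probably fine and the issue is limited to name resolution.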
Is the db running?
The ECS task console says it is. I logged on to the EC2 instance, then into the container, and finally logged in to the MySQL db:
- From within the container, yes, I can run
mysql -u root -p
to log in to MySQL, so it is running.
- From within the container, I tried to dig the hostname db.qa-******-internal:
apt-get update
apt install dnsutils
dig db.qa-******-internal
# dig db.qa-******-internal
; <<>> DiG 9.10.3-P4-Debian <<>> db.qa-*******-internal
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 27933
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0
;; QUESTION SECTION:
;db.qa-******-internal. IN A
;; AUTHORITY SECTION:
. 984 IN SOA a.root-servers.net. nstld.verisign-grs.com. 2020022100 1800 900 604800 86400
;; Query time: 8 msec
;; SERVER: 10.215.0.2#53(10.215.0.2)
;; WHEN: Fri Feb 21 11:53:58 UTC 2020
;; MSG SIZE rcvd: 119
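The dig output above can be read mechanically: the status is NXDOMAIN, the answer count is 0, and the authority section carries the root zone's SOA record. That suggests the resolver at 10.215.0.2 (which looks like the VPC's Amazon-provided resolver) found no zone for the name and fell through to public DNS. The sketch below pulls those fields out of dig's text output; `summarize_dig` is a hypothetical helper, and the regular expressions assume the header layout shown in this post.

```python
import re


def summarize_dig(output):
    """Extract the status, answer count, and server from dig's text output.

    Fields that are not found are returned as None.
    """
    status = re.search(r"status:\s*([A-Z]+)", output)
    answers = re.search(r"ANSWER:\s*(\d+)", output)
    server = re.search(r"SERVER:\s*([0-9a-fA-F.:]+)#\d+", output)
    return {
        "status": status.group(1) if status else None,
        "answers": int(answers.group(1)) if answers else None,
        "server": server.group(1) if server else None,
    }


# Header lines copied from the dig output in this post.
sample = """\
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 27933
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0
;; SERVER: 10.215.0.2#53(10.215.0.2)
"""
print(summarize_dig(sample))
# {'status': 'NXDOMAIN', 'answers': 0, 'server': '10.215.0.2'}
```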
So the name does not resolve to an IP address, even from within the db container itself.
Why is it not returning the IP address?
Please help.