SSMAgent fails to connect on some instances


I'm trying to get SSM to manage some Windows 2019 instances. I followed all the steps, as far as I can tell. On some machines it works; on others it fails. When it fails, I get this in the logs:

error when calling AWS APIs. error details - GetMessages Error: NoCredentialProviders: no valid providers in chain. Deprecated.

Google searches tell me that's probably the IAM role, but I use the same role on my working instances.

Question 1: Is there ever an error with ssmagent being able to use the role that's assigned in the EC2 Console?
I also see this error:

no instanceID provided, Failed to fetch instance ID. Data from vault is empty. RequestError: send request failed
caused by: Get http://169.254.169.254/latest/meta-data/instance-id: dial tcp 169.254.169.254:80: connectex: A socket operation was attempted to an unreachable network.

The instances that fail have an additional ethernet adapter:

Ethernet adapter vEthernet (nat):

   Connection-specific DNS Suffix  . :
   Link-local IPv6 Address . . . . . : fe80::e43a:dd9f:a811:1e6%20
   IPv4 Address. . . . . . . . . . . : 172.27.192.1
   Subnet Mask . . . . . . . . . . . : 255.255.240.0
   Default Gateway . . . . . . . . . :

The instances that work don't have the vEthernet Adapter.

Question 2: If this vEthernet adapter is preventing SSMAgent from reading its configuration, how can I work around it?

Gwindor
asked 4 years ago
2 Answers

Hi,

Looking at the details provided, it seems your Windows 2019 instance cannot reach the EC2 instance metadata address, 169.254.169.254. The instance fetches its IAM role/profile and other key details from that address, which is why the agent reports that it has no credentials. This looks like a network routing issue on the instance that is breaking connectivity to the metadata IP. Can you please run the PowerShell cmdlet below manually and see what error it reports?

Invoke-WebRequest -Uri "http://169.254.169.254/latest/meta-data/"
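If you would rather script the same check, here is a minimal sketch in Python. Python, the `probe` function name, and the 2-second timeout are my own assumptions and are not part of SSM Agent. It bypasses any configured HTTP proxy, since the link-local metadata address is reachable only directly, and it treats any HTTP response (even an error status, such as a 401 when session tokens are required) as proof that routing works.

```python
import urllib.error
import urllib.request

# Same metadata path that appears in the SSM Agent error message.
METADATA_URL = "http://169.254.169.254/latest/meta-data/instance-id"


def probe(url=METADATA_URL, timeout=2.0):
    """Return (reachable, detail) for one GET against the metadata service."""
    # An empty ProxyHandler ignores proxy environment variables.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return True, resp.read().decode("utf-8", "replace").strip()
    except urllib.error.HTTPError as exc:
        # The endpoint answered, so the route is fine even if the status is an error.
        return True, f"HTTP {exc.code}"
    except OSError as exc:
        # URLError and socket timeouts are OSError subclasses: no usable route.
        return False, str(getattr(exc, "reason", exc))
```

Calling `probe()` on the failing instance should return `False` along with the underlying socket error, which you can compare with the message in the agent log.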

Could you also please run the command below to check whether the metadata routes are correctly configured on your instance?

route print

Look at the output of route print. If the metadata routes are not correctly configured, you can run the command below to add them. The gateway address (GATEWAYADDRESS) should be that of the default AWS PV network interface, not the vEthernet adapter.

route -p ADD 169.254.169.254 MASK 255.255.255.255 GATEWAYADDRESS
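To fill in GATEWAYADDRESS without reading the table by eye, something like the following Python sketch could help. It is only an illustration: the function name is hypothetical, and it assumes the usual English-locale `route print` layout, where rows under "Active Routes:" have five columns (destination, netmask, gateway, interface, metric). It picks the lowest-metric IPv4 default route, so please confirm that this gateway belongs to the AWS PV adapter before running the resulting command.

```python
def metadata_route_command(route_print_text):
    """Build the `route -p ADD` line from the lowest-metric IPv4 default route.

    Returns None when no default route is found in the Active Routes section.
    """
    candidates = []
    in_active = False
    for line in route_print_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("IPv6 Route Table"):
            break  # only the IPv4 table matters for 169.254.169.254
        if stripped.startswith("Active Routes:"):
            in_active = True
            continue
        if stripped.startswith("Persistent Routes:"):
            in_active = False
            continue
        fields = stripped.split()
        # Assumed column order: destination, netmask, gateway, interface, metric.
        if (in_active and len(fields) == 5
                and fields[0] == "0.0.0.0" and fields[1] == "0.0.0.0"
                and fields[4].isdigit()):
            candidates.append((int(fields[4]), fields[2]))
    if not candidates:
        return None
    _, gateway = min(candidates)
    return f"route -p ADD 169.254.169.254 MASK 255.255.255.255 {gateway}"
```

You would pass it the text captured from `route print` and review the returned line before executing it from an elevated prompt.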

Regarding your question about the vEthernet adapter: you can try disabling it and see if the issue resolves. By default, an EC2 Windows instance has only the "AWS PV Ethernet" adapter. You are probably seeing the additional Hyper-V vEthernet (nat) adapter because the instances were launched from Windows container-based AMIs rather than general Windows AMIs. If routes are going via the vEthernet (nat) adapter, that would cause this issue, as far as I understand.

Windows containers function similarly to virtual machines with regard to networking. Each container has a virtual network adapter (vNIC) that is connected to a Hyper-V virtual switch (vSwitch). Windows supports five networking drivers, or modes, that can be created through Docker: nat, overlay, transparent, l2bridge, and l2tunnel. For more information, please refer to the Microsoft documentation below.

https://docs.microsoft.com/en-us/virtualization/windowscontainers/container-networking/architecture

I hope this helps. Please reply if you need any further assistance.

answered 4 years ago

Thank you Ajeet for helping.

169.254.169.254 was in the "Persistent Routes" section of the "route print" output, but not in the "Active Routes" section.
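For anyone who wants to check this across several machines, here is a rough Python sketch of that comparison. The function name is made up for illustration, and it assumes the English-locale `route print` output, where the IPv4 table contains "Active Routes:" and "Persistent Routes:" headings and each route row starts with the destination address.

```python
def route_status(route_print_text, destination="169.254.169.254"):
    """Report whether `destination` appears in the IPv4 active and persistent sections."""
    found = {"active": False, "persistent": False}
    section = None
    for line in route_print_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("IPv6 Route Table"):
            break  # the IPv6 table repeats the same headings; stop before it
        if stripped.startswith("Active Routes:"):
            section = "active"
        elif stripped.startswith("Persistent Routes:"):
            section = "persistent"
        elif section and stripped.split()[:1] == [destination]:
            # The first column of a route row is the destination address.
            found[section] = True
    return found
```

The input could come from `subprocess.run(["route", "print"], capture_output=True, text=True).stdout`. A result with `persistent` true and `active` false matches the failing state described here.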

The route -p ADD ... command made it work!

I would like to know more about the persistent vs active routes, but that's a study topic for another day.

Gwindor
answered 4 years ago
