HTTPS call to API Gateway via VPC endpoint intermittently fails to connect


I have a private API Gateway in its own account. It is used by clients that have VPC endpoint interfaces to the execute-api service. Until now these have all had Private DNS enabled and there have been no issues.

A new client also uses some existing public APIs, so Private DNS is disabled on their endpoint. However, they have had intermittent connectivity to the gateway during their testing.

I tried reproducing this from a second account with a test Lambda (Node.js 16, arm) in a VPC, using a VPC endpoint with Private DNS disabled. I was able to reproduce the intermittent connectivity, but I can't understand why it happens. [Edit: the subnets attached to the VPC endpoint all use the same security group, which allows HTTPS ingress from 10.57.150.0/24]
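For reference, the test Lambda does roughly the following. The hostnames, stage, and path below are placeholders rather than the real values, and the Host header is set to the API's execute-api hostname because Private DNS is disabled on the endpoint:

```javascript
// Minimal sketch of the test Lambda (Node.js 16). All hostnames and paths are placeholders.
const https = require('https');

// Interface endpoint DNS name (generic or AZ-specific) - placeholder
const ENDPOINT_HOST = 'vpce-0123456789abcdef0-abcdefgh.execute-api.eu-west-1.vpce.amazonaws.com';
// The private API's execute-api hostname - placeholder
const API_HOST = 'a1b2c3d4e5.execute-api.eu-west-1.amazonaws.com';

exports.handler = async () => {
  return new Promise((resolve, reject) => {
    const req = https.request(
      {
        host: ENDPOINT_HOST,         // connect to the VPC endpoint
        path: '/test/ping',          // placeholder stage and resource
        method: 'GET',
        headers: { Host: API_HOST }, // route to the private API, since Private DNS is disabled
        timeout: 5000,               // surface the intermittent timeouts quickly
      },
      (res) => {
        let body = '';
        res.on('data', (chunk) => (body += chunk));
        res.on('end', () => resolve({ statusCode: res.statusCode, body }));
      }
    );
    req.on('timeout', () => req.destroy(new Error('connection timed out')));
    req.on('error', reject);
    req.end();
  });
};
```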

I found that the intermittent issue could be reproduced when using the generic endpoint DNS name (no AZ marker in the name). If I switch to the endpoint DNS names that include the AZ marker, one of them connects every time, but the other two (we use 3 AZs and one subnet per AZ) always fail to connect with a timeout error. I added a call to resolve the hostname passed in, and all three hosts resolve to the addresses I would expect (10.57.150.x), so I think this is a routing issue rather than a DNS issue.
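The resolution check is along these lines (again, the endpoint DNS names are placeholders):

```javascript
// Sketch of the DNS check (endpoint names are placeholders).
const dns = require('dns').promises;

const hostnames = [
  'vpce-0123456789abcdef0-abcdefgh.execute-api.eu-west-1.vpce.amazonaws.com',            // generic
  'vpce-0123456789abcdef0-abcdefgh-eu-west-1a.execute-api.eu-west-1.vpce.amazonaws.com', // AZ-specific
  'vpce-0123456789abcdef0-abcdefgh-eu-west-1b.execute-api.eu-west-1.vpce.amazonaws.com',
  'vpce-0123456789abcdef0-abcdefgh-eu-west-1c.execute-api.eu-west-1.vpce.amazonaws.com',
];

(async () => {
  for (const host of hostnames) {
    const addresses = await dns.resolve4(host); // each comes back as a 10.57.150.x address
    console.log(host, '->', addresses);
  }
})();
```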

The route tables for all three subnets are identical: two routes for the S3 and DynamoDB prefix lists, a route for 10.57.150.0/24, and 0.0.0.0/0 via a transit gateway. I'm not sure what other information would be useful to add here.
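For completeness, I compared the routes for the three subnets with something like this (subnet IDs and region are placeholders):

```javascript
// Sketch of the route table comparison (subnet IDs and region are placeholders).
const AWS = require('aws-sdk'); // v2 SDK, bundled with the Node.js 16 Lambda runtime
const ec2 = new AWS.EC2({ region: 'eu-west-1' });

const subnetIds = ['subnet-aaaa1111', 'subnet-bbbb2222', 'subnet-cccc3333'];

(async () => {
  for (const subnetId of subnetIds) {
    const { RouteTables } = await ec2
      .describeRouteTables({
        Filters: [{ Name: 'association.subnet-id', Values: [subnetId] }],
      })
      .promise();
    for (const rt of RouteTables) {
      const routes = rt.Routes.map((r) => ({
        destination: r.DestinationCidrBlock || r.DestinationPrefixListId,
        target: r.TransitGatewayId || r.GatewayId || r.VpcEndpointId,
      }));
      console.log(subnetId, JSON.stringify(routes, null, 2));
    }
  }
})();
```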

Has anyone seen anything like this before?
