SageMaker Pipeline connection to RDS times out, but EC2 and Reachability Analyzer can connect. How to fix or debug?


We have a SageMaker notebook created from a Python .ipynb file. The preprocessing step sets the network_config like so:

[Screenshot: network_config code; image not included]

and runs the pipeline with outside network isolation:

[Screenshot: pipeline run code; image not included]

These are private subnets in VPC "ML". We have an RDS instance in VPC "Prod". The pipeline logs in CloudWatch show a timeout when trying to connect to RDS, which has a public DNS name and a security group. The pipeline's security group allows outbound access to anywhere (and to the RDS security group, for good measure), and the RDS security group allows inbound traffic from the pipeline security group (and from the CIDR of the peering connection).

I created an EC2 instance in the private ML VPC subnet with the pipeline SG. It could reach the RDS.

I used Reachability Analyzer to trace from that instance to the local IP of RDS. It worked and was labeled "Reachable", like this:

[Screenshot: Reachability Analyzer result; image not included]

The security group and subnet of that instance are the same ones specified in the pipeline network_config.

How can I diagnose the SageMaker Pipeline? Why will it not connect to RDS when my EC2 instance does connect? Any help would be great, thank you.
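
One way to narrow this down might be to run a small TCP probe from inside the processing container itself, so that the CloudWatch log shows which IP the RDS hostname resolves to and whether the socket connects. The sketch below uses only the Python standard library; the function name `probe_tcp` and its return format are made up for illustration and are not part of any SageMaker API. The resolved address is worth checking because, across a VPC peering connection, a public DNS name can resolve to the public IP rather than the private one unless DNS resolution is enabled on the peering connection, whereas the EC2 and Reachability Analyzer tests above targeted the local IP.

```python
import socket
import time


def probe_tcp(host, port, timeout=5.0):
    """Resolve `host`, then attempt one TCP connection to (host, port).

    Returns a dict describing what happened, suitable for printing to the job log.
    This is a hypothetical helper for illustration, not a SageMaker API.
    """
    result = {
        "host": host,
        "port": port,
        "resolved_ips": [],
        "connected": False,
        "error": None,
        "elapsed_s": None,
    }

    # Step 1: DNS. Record every address the name resolves to from this network.
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        result["resolved_ips"] = sorted({info[4][0] for info in infos})
    except socket.gaierror as exc:
        result["error"] = f"DNS resolution failed: {exc}"
        return result

    # Step 2: TCP. A timeout here usually points at routing or security groups,
    # while an immediate refusal means the host was reached but the port is closed.
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            result["connected"] = True
    except OSError as exc:  # includes timeouts and connection refusals
        result["error"] = f"TCP connect failed: {exc!r}"
    result["elapsed_s"] = round(time.monotonic() - start, 3)
    return result
```

Calling something like `print(probe_tcp(rds_endpoint, rds_port))` at the top of the preprocessing script, with `rds_endpoint` and `rds_port` set to your own values, and running the same call on the EC2 instance would let you compare the resolved IPs side by side.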

No Answers
