I want to connect my Amazon SageMaker Studio notebook with an Amazon Redshift cluster.
Resolution
Publicly accessible cluster
If the Amazon Redshift cluster is publicly accessible, then access the cluster from either of the following:
- A SageMaker Studio domain launched with public internet access only and no Amazon Virtual Private Cloud (Amazon VPC) access
- A SageMaker Studio domain launched in an Amazon VPC
If the SageMaker Studio domain is launched in a VPC and the Amazon Redshift cluster is in a different VPC, then configure a VPC peering connection between the two VPCs so that SageMaker Studio can access the cluster.
Private cluster
If the Amazon Redshift cluster is private, then you can access the cluster only through a SageMaker Studio domain launched in an Amazon VPC. If the cluster is in a different VPC, then configure a VPC peering connection between the two VPCs so that SageMaker Studio can access the cluster.
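Before you request a VPC peering connection, it can help to confirm that the IPv4 CIDR blocks of the two VPCs don't overlap, because Amazon VPC doesn't support peering between VPCs with matching or overlapping CIDR blocks. The following is a minimal sketch that uses only the Python standard library. The function name and the CIDR values are illustrative assumptions, not values from your account.

```python
import ipaddress


def cidrs_overlap(studio_vpc_cidr: str, redshift_vpc_cidr: str) -> bool:
    """Return True if the two IPv4 CIDR blocks share any addresses."""
    studio_net = ipaddress.ip_network(studio_vpc_cidr)
    redshift_net = ipaddress.ip_network(redshift_vpc_cidr)
    return studio_net.overlaps(redshift_net)


if __name__ == "__main__":
    # Hypothetical CIDR blocks; replace them with the CIDR blocks of your own VPCs.
    print(cidrs_overlap("10.0.0.0/16", "10.1.0.0/16"))    # False: peering is possible
    print(cidrs_overlap("10.0.0.0/16", "10.0.128.0/20"))  # True: peering isn't supported
```

If the check returns True, then the two VPCs can't be peered as they are, and you need a different network design.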
Additional requirements
Be sure that the following requirements are met for both types of clusters:
- The security group attached to the SageMaker Studio domain allows outbound traffic to ephemeral ports. When a SageMaker Studio client connects to an Amazon Redshift server, a random port from the ephemeral port range (1024-65535) becomes the client's source port.
- The security group attached to the Amazon Redshift cluster allows inbound connections from the security group attached to the SageMaker Studio domain on port 5439, the default Amazon Redshift port.
- If you configured custom DNS, verify that the DNS server used by the SageMaker Studio VPC can resolve the hostname of the Amazon Redshift cluster.
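To check these requirements from a SageMaker Studio notebook, you can test DNS resolution and TCP reachability of the cluster endpoint. The following is a minimal sketch that uses only the Python standard library. The function names and the endpoint hostname are hypothetical, and the port assumes that the cluster uses the default port 5439.

```python
import socket
from typing import List


def resolve_host(hostname: str) -> List[str]:
    """Return the IP addresses that the hostname resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each entry's sockaddr starts with the IP address; remove duplicates.
    return sorted({info[4][0] for info in infos})


def can_connect(hostname: str, port: int = 5439, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to hostname:port succeeds within timeout."""
    try:
        with socket.create_connection((hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    # Hypothetical endpoint; replace it with your cluster's endpoint hostname.
    endpoint = "example-cluster.abc123.us-east-1.redshift.amazonaws.com"
    print("Resolved addresses:", resolve_host(endpoint))
    print("TCP port 5439 reachable:", can_connect(endpoint))
```

An empty address list suggests a DNS issue. If the hostname resolves but the connection times out, then review the security group rules and the routing between the two VPCs.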
Related information
Connect to data sources
Using the Amazon Redshift data API to interact from an Amazon SageMaker Jupyter notebook
Ingest data with Redshift