When you create a project in Amazon SageMaker Unified Studio, AWS Glue Data Catalog databases and Amazon Redshift resources are automatically provisioned as part of the project setup. This is by design and serves specific purposes:
- The purpose of automatically creating these resources:
- The AWS Glue Data Catalog database serves as a central metadata store that allows you to catalog data from various sources, making it discoverable and accessible within your project.
- The Amazon Redshift resources (likely Amazon Redshift Serverless workgroups) are provisioned to enable data warehousing capabilities and to support querying data directly from within your SageMaker project.
- Together, these resources form part of the SageMaker Lakehouse architecture, which unifies data across S3 data lakes and Redshift data warehouses to enable analytics, machine learning, and AI applications on a single copy of data.
- Yes, this is part of the default architecture for SageMaker Unified Studio projects. When creating a project, SageMaker automatically sets up AWS Glue Data Catalog databases, establishes a catalog for Redshift Managed Storage (RMS) data, and provisions necessary permissions to create a complete working environment.
- Regarding costs and disabling automatic provisioning:
- The documentation doesn't explicitly mention a way to disable the automatic provisioning of Redshift resources.
- Since this is part of the core functionality of SageMaker Unified Studio projects, it may not be possible to disable it while still using the full project capabilities.
- You might want to check if there are project profile options or blueprints that create projects with fewer resources if you don't need the data warehousing capabilities.
- Regarding resource persistence after deletion:
- The behavior you're observing where Glue databases and Redshift clusters remain after project deletion appears to be by design.
- This persistence likely helps preserve your data and metadata even if you accidentally delete a project.
- You would need to manually delete these resources if you no longer need them.
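If you want to script that manual cleanup, here is a minimal sketch using boto3. It assumes the leftover resources share a recognizable name fragment (the `name_fragment` argument is hypothetical; check the actual names in your account first) and that the Redshift resources are Redshift Serverless workgroups. It defaults to a dry run so nothing is deleted until you have reviewed the list.

```python
def find_leftovers(glue, redshift_serverless, name_fragment):
    """Return (database_names, workgroup_names) whose names contain name_fragment.

    Only the first page of each listing is read; follow NextToken / nextToken
    if the account holds many resources.
    """
    databases = [
        db["Name"]
        for db in glue.get_databases()["DatabaseList"]
        if name_fragment in db["Name"]
    ]
    workgroups = [
        wg["workgroupName"]
        for wg in redshift_serverless.list_workgroups()["workgroups"]
        if name_fragment in wg["workgroupName"]
    ]
    return databases, workgroups


def delete_leftovers(glue, redshift_serverless, databases, workgroups, dry_run=True):
    """Delete the given resources; with dry_run=True only report what would be deleted."""
    actions = []
    for name in databases:
        actions.append(("glue-database", name))
        if not dry_run:
            glue.delete_database(Name=name)
    for name in workgroups:
        actions.append(("redshift-workgroup", name))
        if not dry_run:
            redshift_serverless.delete_workgroup(workgroupName=name)
    return actions


# Example wiring (requires boto3 and AWS credentials; the fragment is a placeholder):
# import boto3
# glue = boto3.client("glue")
# rs = boto3.client("redshift-serverless")
# dbs, wgs = find_leftovers(glue, rs, "my_project")
# print(delete_leftovers(glue, rs, dbs, wgs, dry_run=True))
```

Review the dry-run output before calling it again with `dry_run=False`. A Redshift Serverless namespace can only be removed after its workgroup is gone, so that step is left out of this sketch.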
These automatically provisioned resources enable SageMaker Unified Studio to provide an integrated environment where data scientists and analysts can collaborate, access data from various sources, and develop ML models without having to set up and configure these components separately.
Sources
Amazon SageMaker Lakehouse integrated access controls now available in Amazon Athena federated queries | AWS News Blog
Grant access to managed Amazon Redshift assets in Amazon SageMaker Unified Studio - Amazon SageMaker Unified Studio
Amazon SageMaker Unified Studio terminology and concepts - Amazon SageMaker Unified Studio
Prepare, Integrate, and Orchestrate Data – Amazon SageMaker Data Processing and Analytics – AWS

Since the Redshift resource is Serverless, there are no costs if you don't use the data warehousing capabilities.