You actually can use an Amazon Redshift Serverless table as your target in AWS Glue Studio jobs. The catch is that you need to add the target Redshift Serverless table to your Data Catalog first, either manually or with a Glue crawler.
This is noted here: https://docs.aws.amazon.com/glue/latest/ug/data-target-nodes.html.
"For all data sources except Amazon S3 and connectors, a table must exist in the AWS Glue Data Catalog for the target type that you choose. AWS Glue Studio does not create the Data Catalog table." There is also fine print in the Glue Studio job itself. Under "Amazon RedShift" is says "AWS Glue Data Catalog table with RedShift as the data target".
Once you have added it, you can use it as a source or target in the Glue Studio job. Use the drop-down to select "Amazon Redshift" as the target. Note that the database and table names will be based on the Glue Data Catalog table names, not whatever you named them in the actual Redshift cluster.
To use it with a crawler, you first need to add a connection in Glue; Redshift Serverless uses a JDBC connection. Then, when creating the crawler, select "JDBC" as your data source and choose the Redshift connection you created.
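If you prefer to script that setup rather than use the console, a minimal boto3 sketch might look like the following. The crawler name, IAM role, catalog database, connection name, and JDBC path are placeholders, not values from this thread.

```python
import boto3

glue = boto3.client("glue")

# Hypothetical crawler that catalogs Redshift Serverless tables through a
# JDBC connection already defined in Glue.
glue.create_crawler(
    Name="redshift-serverless-crawler",
    Role="arn:aws:iam::123456789012:role/MyGlueCrawlerRole",
    DatabaseName="my_glue_catalog_db",
    Targets={
        "JdbcTargets": [
            {
                # Glue connection pointing at the Redshift Serverless workgroup endpoint.
                "ConnectionName": "my-redshift-serverless-connection",
                # database/schema/table path inside Redshift, e.g. dev/public/%
                "Path": "dev/public/%",
            }
        ]
    },
)

# Run the crawler so the target table lands in the Data Catalog.
glue.start_crawler(Name="redshift-serverless-crawler")
```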
Hope that helps!
Thank you for your response. I will look into this closer. I appreciate your guidance.
The other option is to not use a Data Catalog table at all and just specify the cluster details when you call glueContext.create_dynamic_frame_from_options("redshift", connection_options).
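As a rough illustration of that approach inside a Glue PySpark job, something like the sketch below might work. The endpoint, credentials, table name, temp directory, and IAM role are placeholder values you would replace with your own.

```python
from awsglue.context import GlueContext
from pyspark.context import SparkContext

sc = SparkContext.getOrCreate()
glueContext = GlueContext(sc)

# Placeholder connection details for a Redshift Serverless workgroup endpoint.
connection_options = {
    "url": "jdbc:redshift://my-workgroup.123456789012.us-east-1.redshift-serverless.amazonaws.com:5439/dev",
    "dbtable": "public.my_target_table",
    "user": "awsuser",
    "password": "********",
    "redshiftTmpDir": "s3://my-temp-bucket/redshift-tmp/",
    "aws_iam_role": "arn:aws:iam::123456789012:role/MyRedshiftCopyRole",
}

# Read from Redshift without a Data Catalog table...
dyf = glueContext.create_dynamic_frame_from_options(
    connection_type="redshift",
    connection_options=connection_options,
)

# ...or write the same way when Redshift is the target.
glueContext.write_dynamic_frame.from_options(
    frame=dyf,
    connection_type="redshift",
    connection_options=connection_options,
)
```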
Unfortunately, Glue Studio does not currently support Redshift Serverless as a target. You'll need to use another approach to transfer data from your S3 bucket to your Redshift database. Some options include:
- Use a traditional Redshift cluster as the target in Glue Studio and then use Amazon Redshift Spectrum to query your Redshift Serverless database from the Redshift cluster.
- Use a different AWS service, such as AWS Data Pipeline, AWS Lambda, or AWS Step Functions, to transfer the data from S3 to Redshift Serverless.
- Use a Spark job with the Redshift JDBC driver to transfer the data.
- Use the Redshift COPY command to load the data directly from S3 into Redshift Serverless (see the sketch after this list).
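For the last option, a COPY statement can be issued against Redshift Serverless through the Redshift Data API. This is only a sketch under assumed names: the workgroup, database, table, bucket, and IAM role are made up for illustration.

```python
import boto3

redshift_data = boto3.client("redshift-data")

# Hypothetical COPY from S3 into a Redshift Serverless table.
copy_sql = """
    COPY public.my_target_table
    FROM 's3://my-source-bucket/exports/'
    IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftCopyRole'
    FORMAT AS CSV
    IGNOREHEADER 1;
"""

response = redshift_data.execute_statement(
    WorkgroupName="my-serverless-workgroup",  # serverless uses WorkgroupName instead of ClusterIdentifier
    Database="dev",
    Sql=copy_sql,
)

# The statement runs asynchronously; poll describe_statement for completion.
print(redshift_data.describe_statement(Id=response["Id"])["Status"])
```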
Thank you for the quick response. I'll investigate your suggestions.
Hello! I'm going through a similar scenario. Were you able to get this resolved?
@UICVA I was never able to get this to work. We ended up moving away from Redshift and using RDS instead. We are now building Lambda jobs to perform our ETL.
Thank you so much for responding! Will explore other options!