Hi,
If your objective is purely to run ELT, your most cost-effective option is a Glue Python Shell job that imports the Snowflake Python connector. With this method you can execute your SQL code against Snowflake at the cost of 1/16 of a DPU.
You could also import the Snowflake Python connector inside a Glue Spark ETL job, but the job would be mostly idle and you would overspend for the same operation.
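As a rough illustration of the Python Shell route, here is a minimal sketch that runs a list of SQL statements through the Snowflake Python connector. The `run_elt` helper and the `SNOWFLAKE_*` environment variable names are hypothetical; in a real job you would more likely read credentials from job parameters or a secrets store, and you need to make the `snowflake-connector-python` package available to the job.

```python
import os


def run_elt(statements, connect=None):
    """Run SQL statements in order against Snowflake; return how many ran."""
    if connect is None:
        # Imported lazily so the module still loads where the connector is absent.
        import snowflake.connector
        connect = snowflake.connector.connect

    # Hypothetical environment variable names; adapt to how your job gets secrets.
    params = {
        "account": os.environ["SNOWFLAKE_ACCOUNT"],
        "user": os.environ["SNOWFLAKE_USER"],
        "password": os.environ["SNOWFLAKE_PASSWORD"],
    }
    # Optional session settings are only passed when they are set.
    for key in ("warehouse", "database", "schema"):
        value = os.environ.get("SNOWFLAKE_" + key.upper())
        if value:
            params[key] = value

    conn = connect(**params)
    executed = 0
    try:
        cur = conn.cursor()
        try:
            for sql in statements:
                cur.execute(sql)  # the work happens inside Snowflake
                executed += 1
        finally:
            cur.close()
    finally:
        conn.close()
    return executed
```

Calling something like `run_elt(["INSERT INTO target SELECT * FROM staging"])` from the job script is then enough; the job itself only waits on Snowflake, which is why the small Python Shell capacity is sufficient.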
The Glue Studio Connector for Snowflake should work similarly to the Snowflake Connector for Spark. The main goal of this connector is fast data exchange between Snowflake and Spark, so when writing to Snowflake it first writes to S3 and then uses the Snowflake COPY command. It offers the ability to run some pre and post SQL, but you would still need to load something into a staging table.
If you do some transformation in Spark, load the DataFrame into a Snowflake table, and then need to run your Snowflake SQL, the Glue Studio Connector for Snowflake with a post action would be the best choice.
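To make the post-action pattern concrete, here is a minimal sketch written against the Snowflake Connector for Spark, whose `postactions` option takes a semicolon-separated list of SQL commands to run after the data transfer. The helper names are hypothetical, and the exact wiring through the Glue Studio connector may differ, so treat this as an illustration of the idea rather than the connector's definitive usage.

```python
SNOWFLAKE_SOURCE = "net.snowflake.spark.snowflake"


def snowflake_write_options(connection, table, post_sql):
    """Merge connection settings with the target table and post-load SQL."""
    opts = dict(connection)  # e.g. sfURL, sfUser, sfPassword, sfDatabase, ...
    opts["dbtable"] = table
    # Join the statements into the semicolon-separated form the option expects.
    opts["postactions"] = "; ".join(s.strip().rstrip(";") for s in post_sql)
    return opts


def write_then_transform(df, connection, table, post_sql, mode="append"):
    """Load a Spark DataFrame into a staging table, then run SQL in Snowflake."""
    opts = snowflake_write_options(connection, table, post_sql)
    df.write.format(SNOWFLAKE_SOURCE).options(**opts).mode(mode).save()
```

Here `df` is the DataFrame produced by your Spark transformation, `table` is the staging table it lands in, and `post_sql` holds the Snowflake statements that pick up from there.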
The Glue Studio SQL transform runs your code as Spark SQL, and it is currently meant for ETL, not ELT.
Thank you for clarifying. That is really helpful. I'll try going the Python Shell route.