Hi,
If your objective is purely to run ELT, your most cost-effective option would be a Glue Python Shell job that imports the Snowflake Python connector. With this approach you can execute your SQL code against Snowflake at the cost of 1/16 of a DPU.
You could also import the Snowflake Python connector inside a Glue Spark ETL job, but the job would be mostly idle and you would overspend for the same operation.
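For reference, a minimal sketch of what such a Python Shell job could look like. The account, credentials, and SQL below are placeholders, and the connector package would need to be supplied to the job (for example via `--additional-python-modules`):

```python
import snowflake.connector

# Placeholder connection details -- in a real job these would come from
# AWS Secrets Manager or Glue job parameters, not hard-coded values.
conn = snowflake.connector.connect(
    account="my_account",       # hypothetical account identifier
    user="etl_user",
    password="***",
    warehouse="ELT_WH",
    database="ANALYTICS",
    schema="PUBLIC",
)

try:
    cur = conn.cursor()
    # The ELT work happens entirely inside Snowflake; Glue only issues the SQL.
    cur.execute("""
        INSERT INTO reporting.daily_sales
        SELECT order_date, SUM(amount)
        FROM raw.orders
        GROUP BY order_date
    """)
finally:
    conn.close()
```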
The Glue Studio Connector for Snowflake should work similarly to the Snowflake Connector for Spark. The main goal of this connector is fast data exchange between Snowflake and Spark, so when writing to Snowflake it first writes to S3 and then uses the Snowflake COPY command. It offers the ability to run some pre and post SQL, but you would still need to load something into a staging table.
If you do some transformation in Spark, load the DataFrame into a Snowflake table, and then need to run your Snowflake SQL, the Glue Studio Connector for Snowflake with a post action would be the best choice (see the sketch below).
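A rough sketch of that pattern using the plain Snowflake Spark connector format; the account URL, table names, and the post-action option key are illustrative (the exact option name can differ between connector versions, so check the connector documentation):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical DataFrame transformed earlier in the Glue job.
df = spark.read.parquet("s3://my-bucket/transformed/")  # placeholder path

sf_options = {
    "sfURL": "my_account.snowflakecomputing.com",  # placeholder account URL
    "sfUser": "etl_user",
    "sfPassword": "***",
    "sfDatabase": "ANALYTICS",
    "sfSchema": "PUBLIC",
    "sfWarehouse": "ELT_WH",
}

(df.write
   .format("net.snowflake.spark.snowflake")
   .options(**sf_options)
   .option("dbtable", "STAGING_SALES")  # staging table loaded via S3 + COPY
   # Post-action SQL that runs inside Snowflake after the load completes;
   # the option name is assumed here and may differ by connector version.
   .option("postactions",
           "INSERT INTO reporting.daily_sales "
           "SELECT order_date, SUM(amount) FROM STAGING_SALES GROUP BY order_date;")
   .mode("overwrite")
   .save())
```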
The Glue Studio SQL transform will implement your code in SparkSQL, and it is currently meant for ETL, not ELT.
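In other words, the SQL transform behaves roughly like the simplified illustration below: the SQL executes in SparkSQL over data already read into Spark, rather than being pushed down to Snowflake.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical data already read into Spark by the Glue job.
orders = spark.createDataFrame(
    [("2023-01-01", 10.0), ("2023-01-01", 5.0)],
    ["order_date", "amount"],
)
orders.createOrReplaceTempView("orders")

# The SQL runs in Spark's own engine, so the compute stays in the Glue job.
result = spark.sql(
    "SELECT order_date, SUM(amount) AS total FROM orders GROUP BY order_date"
)
result.show()
```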
Thank you for clarifying. That is really helpful. I'll try going the Python Shell route.