1 Answer
Hi,
I imagine that you are using Glue Studio and checking the Upsert box in the target node.
AWS Glue implements the upsert following Redshift best practices, using pre- and post-SQL statements. The whole flow is:
- the pre-SQL creates a staging table
- all new records are inserted into the staging table
- the post-SQL then:
  - uses the defined key to delete all rows from the target table that exist in the staging table
  - inserts all rows from the staging table into the target table
  - drops the staging table
Based on this, you can see that you do not have to worry about the columns you do not want to update: just keep them unchanged and they will not be modified during the process. This has no performance impact.
Hope this helps and is clear.
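The flow above can be sketched end to end with a small, self-contained example. This uses sqlite3 purely as a stand-in for Redshift, and the table and column names (`target`, `stage`, `id`, `col2`, `col3`) are illustrative, not what Glue actually generates:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Target table with existing rows (names are illustrative).
cur.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, col2 TEXT, col3 TEXT)")
cur.executemany("INSERT INTO target VALUES (?, ?, ?)",
                [(1, "old", "keep"), (2, "old", "keep")])

# Pre-SQL: create a staging table with the same structure as the target.
cur.execute("CREATE TABLE stage (id INTEGER, col2 TEXT, col3 TEXT)")

# Glue then loads the new/changed records into the staging table.
cur.executemany("INSERT INTO stage VALUES (?, ?, ?)",
                [(2, "new", "keep"), (3, "new", "keep")])

# Post-SQL step 1: delete target rows whose key exists in staging.
cur.execute("DELETE FROM target WHERE id IN (SELECT id FROM stage)")
# Post-SQL step 2: insert everything from staging into the target.
cur.execute("INSERT INTO target SELECT * FROM stage")
# Post-SQL step 3: drop the staging table.
cur.execute("DROP TABLE stage")
conn.commit()

result = sorted(cur.execute("SELECT * FROM target").fetchall())
print(result)
# -> [(1, 'old', 'keep'), (2, 'new', 'keep'), (3, 'new', 'keep')]
```

Row 1 is untouched, row 2 is replaced, and row 3 is inserted, which is exactly the upsert semantics the delete-then-insert pattern gives you.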
Thanks for your response. Adding more information about the issue.
When creating the staging table, Glue builds it from the target table's structure, so the staging table has columns col-1, col-2, ... col-6. Since I applied transformations only to col-2, col-3, and col-4, the insert into the staging table throws an error like the one below:
The exception shows it expects col1 in the transformation, which is blocking our ETL operation. It would be much appreciated if we could get a solution for this.
Hi,
I would like to know how to do an upsert operation in AWS Glue for non-Redshift databases using a JDBC connection.
@AWS-User-2414105 Really sorry for such a late reply, I might have missed the notification of your comment.
What I meant is that you should add the other columns to the transformation but leave them unchanged; that way they are also replaced in the last step, but with their current values.
Hope this is clear now.
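To make the point concrete: because the upsert replaces whole rows, every column must flow through the transformation, even the ones you never modify. A minimal sketch, with illustrative column names (col1-col6 and the specific transforms are assumptions, not your actual mapping):

```python
def transform(record):
    # Start from the full row so untouched columns (col1, col5, col6)
    # pass through unchanged and are still present in the output.
    out = dict(record)
    # Transform only the columns that actually change.
    out["col2"] = record["col2"].upper()
    out["col3"] = record["col3"].strip()
    out["col4"] = record["col4"] * 2
    return out

row = {"col1": 1, "col2": "ab", "col3": " x ", "col4": 3,
       "col5": "keep", "col6": "keep"}
result = transform(row)
print(result)
# -> {'col1': 1, 'col2': 'AB', 'col3': 'x', 'col4': 6, 'col5': 'keep', 'col6': 'keep'}
```

In Glue Studio the equivalent is keeping all six columns in the node's mapping (e.g. in an ApplyMapping transform) so the staging table receives a complete row.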
@AWS-User-9029218 You have two options: 1) you could use an external library to run pure SQL statements, see this answer: https://repost.aws/questions/QUQ6gsQY2CQgWfvdbwI8HmVg 2) you can implement something similar to what is described in this external blog post: https://medium.com/@thomaspt748/how-to-upsert-data-into-relational-database-using-spark-7d2d92e05bb9
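Option 2 boils down to pushing each Spark partition's rows through a database connection that runs a native upsert statement. A minimal sketch, using sqlite3 as a stand-in for the JDBC connection (on PostgreSQL you would use `INSERT ... ON CONFLICT`, on MySQL `INSERT ... ON DUPLICATE KEY UPDATE`; the `target` table and `upsert_partition` helper are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, val TEXT)")
conn.execute("INSERT INTO target VALUES (1, 'old')")

def upsert_partition(rows, conn):
    # In Spark this body would run inside df.foreachPartition(...),
    # opening its own connection per partition.
    conn.executemany(
        "INSERT INTO target (id, val) VALUES (?, ?) "
        "ON CONFLICT(id) DO UPDATE SET val = excluded.val",
        rows,
    )
    conn.commit()

upsert_partition([(1, "new"), (2, "new")], conn)
result = sorted(conn.execute("SELECT * FROM target").fetchall())
print(result)
# -> [(1, 'new'), (2, 'new')]
```

The existing key 1 is updated in place and key 2 is inserted, which is the behavior the blog post builds on top of Spark's partition iteration.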