How to configure insert order of columns into AWS RedShift from AWS Glue

0

Hi there,

I have an AWS Glue Pipeline where I am reading data from an RDS Postgres instance and copying data over into a RedShift warehouse. I have noticed that when using the "MERGE" operation, that the order of the columns in the output format matters. If the column order of the output format does not match the column order of my Redshift table, then inserts fail.

To resolve this problem, I have to manually edit the JSON of my Data Catalog schema to match the order of my Redshift columns.

The issue is, is that whenever I recrawl my Postgres database, the column order in my Data Catalog schema gets reset and I have to manually change the order again. Is there a way to either a) preserve the order of the schema, or b) somehow reorder the output in the visual editor?

Thanks, Logan

  • I guess if you recrawl is to pick up new columns and in that case I don't see how a visual job could keep the order (I was thinking of a SelectFields transforms) while including new fields it doesn't know about (not present when the job was defined).

  • I suppose thats true, although the new column could just get appended to the end. What's annoying is that the order of ALL the tables is reset, not just the table with the new column.

profile picture
已提問 1 年前檢視次數 700 次
1 個回答
0

Hello Logan, Based on your query I have created a glue job with RDS Postgres DB as source and Redshift as a target. Created a table with id firstname and lastname in Postgres and crawled the table. Ran the job with auto generated script from glue which is showing order lastname, id and firstname. The job is failing as you mentioned. After reordering all the occurrence id, firstname and lastname the glue job is running successful. Currently reording of columns from Glue studio feature is unavailable. For data copying if you are using Glue job then editing the glue job once will resolve the issue, even though the recrawl resulting in changing of order.

AWS
已回答 1 年前
profile picture
專家
已審閱 1 個月前

您尚未登入。 登入 去張貼答案。

一個好的回答可以清楚地回答問題並提供建設性的意見回饋,同時有助於提問者的專業成長。

回答問題指南