How to configure insert order of columns into AWS RedShift from AWS Glue

0

Hi there,

I have an AWS Glue Pipeline where I am reading data from an RDS Postgres instance and copying data over into a RedShift warehouse. I have noticed that when using the "MERGE" operation, that the order of the columns in the output format matters. If the column order of the output format does not match the column order of my Redshift table, then inserts fail.

To resolve this problem, I have to manually edit the JSON of my Data Catalog schema to match the order of my Redshift columns.

The issue is, is that whenever I recrawl my Postgres database, the column order in my Data Catalog schema gets reset and I have to manually change the order again. Is there a way to either a) preserve the order of the schema, or b) somehow reorder the output in the visual editor?

Thanks, Logan

  • I guess if you recrawl is to pick up new columns and in that case I don't see how a visual job could keep the order (I was thinking of a SelectFields transforms) while including new fields it doesn't know about (not present when the job was defined).

  • I suppose thats true, although the new column could just get appended to the end. What's annoying is that the order of ALL the tables is reset, not just the table with the new column.

profile picture
asked 10 months ago622 views
1 Answer
0

Hello Logan, Based on your query I have created a glue job with RDS Postgres DB as source and Redshift as a target. Created a table with id firstname and lastname in Postgres and crawled the table. Ran the job with auto generated script from glue which is showing order lastname, id and firstname. The job is failing as you mentioned. After reordering all the occurrence id, firstname and lastname the glue job is running successful. Currently reording of columns from Glue studio feature is unavailable. For data copying if you are using Glue job then editing the glue job once will resolve the issue, even though the recrawl resulting in changing of order.

AWS
answered 10 months ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions