Load .DAT file from S3 into Redshift Using AWS Glue

I receive a file from an external vendor in .dat format. Once the file arrives in my S3 bucket, I have to trigger an AWS Glue job to read the file and load it into my Redshift table. The logic works for CSV and Parquet files, but it does not work for DAT files. Below is my code:

order_lines = glueContext.create_dynamic_frame.from_options(
    format_options={
        "quoteChar": '"',
        "withHeader": True,
        "separator": "|",
        "optimizePerformance": False,
    },
    connection_type="s3",
    format="dat",
    connection_options={
        "paths": [
            f"s3://mybucket/staging/optical/order_lines_summary/OPTICAL_ORDER_LINE_SUMMARY_2024-02-26-01-00-00-404.dat"
        ],
        "recurse": True,
    },
    transformation_ctx="order_lines",
)

Can anyone please help on this?

2 Answers

Hello, "dat" is not a valid Glue format. A .dat file is just delimited text, so in your case you would still use "csv" as the format and set "|" as the separator in the format options. Please see this link for the comprehensive list of formats supported by Glue: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format.html
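
As a rough sketch of that change, assuming the file really is pipe-delimited text with a header row: the helper names build_read_args and read_pipe_delimited are hypothetical and only there to keep the arguments in one place, and glue_context stands for the GlueContext that already exists in your job. The option keys are the same ones used in the question.

```python
def build_read_args(s3_path, ctx_name="order_lines"):
    """Build keyword arguments for create_dynamic_frame.from_options.

    Hypothetical helper: the file extension is irrelevant to Glue, so a
    pipe-delimited .dat file is read with format="csv" and separator="|".
    """
    return {
        "connection_type": "s3",
        "format": "csv",  # was "dat" in the question
        "format_options": {
            "quoteChar": '"',
            "withHeader": True,
            "separator": "|",
            "optimizePerformance": False,
        },
        "connection_options": {"paths": [s3_path], "recurse": True},
        "transformation_ctx": ctx_name,
    }


def read_pipe_delimited(glue_context, s3_path, ctx_name="order_lines"):
    """Read a pipe-delimited file from S3 into a DynamicFrame."""
    return glue_context.create_dynamic_frame.from_options(
        **build_read_args(s3_path, ctx_name)
    )
```

Inside the job you would call read_pipe_delimited(glueContext, "s3://mybucket/staging/.../your_file.dat") and pass the result to your existing Redshift write step unchanged.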

AWS
answered 2 months ago

The same code worked for me to read a DAT file. My issue was that the S3 bucket I was initially referring to was not accessible by AWS Glue. After changing the path to my own S3 bucket location, I was able to read the file. Thanks.

Joe
answered 2 months ago
