Load .DAT file from S3 into Redshift Using AWS Glue


I receive a file from an external vendor in .dat format. Once the file arrives in my S3 bucket, I have to trigger an AWS Glue job that reads the file and loads it into my Redshift table. I have the logic, and it works for CSV and Parquet files, but it does not work for DAT files. Below is my code:

order_lines = glueContext.create_dynamic_frame.from_options(
    format_options={
        "quoteChar": '"',
        "withHeader": True,
        "separator": "|",
        "optimizePerformance": False,
    },
    connection_type="s3",
    format="dat",
    connection_options={
        "paths": [
            f"s3://mybucket/staging/optical/order_lines_summary/OPTICAL_ORDER_LINE_SUMMARY_2024-02-26-01-00-00-404.dat"
        ],
        "recurse": True,
    },
    transformation_ctx="order_lines",
)

Can anyone please help with this?

2 Answers

Hello, "dat" is not a valid Glue format. In your case you would still use "csv" as the format and "|" as the separator in the format options. See this link for the comprehensive list of formats supported by Glue: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format.html
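A minimal sketch of that change, assuming the file is pipe-delimited text with a header row as the question's format options suggest. The helper name `build_pipe_delimited_read_args` is hypothetical and only exists to keep the options in one place; the only difference from the question's code is `format="csv"`.

```python
def build_pipe_delimited_read_args(path, ctx_name="order_lines"):
    """Build keyword arguments for glueContext.create_dynamic_frame.from_options.

    Hypothetical helper: it only assembles the options from the question,
    with the format switched from "dat" to "csv".
    """
    return {
        "connection_type": "s3",
        # Glue picks the reader from this value, not from the file extension.
        "format": "csv",
        "format_options": {
            "quoteChar": '"',
            "withHeader": True,
            "separator": "|",  # the .dat file is pipe-delimited
            "optimizePerformance": False,
        },
        "connection_options": {"paths": [path], "recurse": True},
        "transformation_ctx": ctx_name,
    }


read_args = build_pipe_delimited_read_args(
    "s3://mybucket/staging/optical/order_lines_summary/"
    "OPTICAL_ORDER_LINE_SUMMARY_2024-02-26-01-00-00-404.dat"
)

# Inside the Glue job, where glueContext already exists:
# order_lines = glueContext.create_dynamic_frame.from_options(**read_args)
```

Keeping the options in a plain dictionary makes it easy to reuse the same reader settings for other pipe-delimited vendor files.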

AWS
answered 2 months ago

The same code worked for me to read a DAT file. My issue was that the S3 bucket I was initially referring to was not accessible by AWS Glue. After changing the path to my own S3 bucket location, I was able to read the file. Thanks.
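An access problem like this can be checked before the read. The following is a rough sketch, not part of the original job: `split_s3_uri` and `can_read_object` are hypothetical helpers, and the S3 client is passed in so the check runs with whatever credentials the caller has. To be meaningful it should run inside the Glue job, under the job's IAM role.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/key' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def can_read_object(s3_client, uri):
    """Return True if head_object succeeds for the URI, False otherwise.

    s3_client is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    bucket, key = split_s3_uri(uri)
    try:
        s3_client.head_object(Bucket=bucket, Key=key)
        return True
    except Exception as exc:
        # Coarse on purpose for this sketch; boto3 reports missing or
        # forbidden objects as botocore.exceptions.ClientError.
        print(f"Cannot read {uri}: {exc}")
        return False


# Inside the Glue job:
# import boto3
# if not can_read_object(boto3.client("s3"), path):
#     raise RuntimeError("Glue role cannot read the input file")
```

Failing early with a clear message separates permission problems from format problems, which look similar when the read simply returns no data or errors out.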

Joe
answered 2 months ago
