Working with .sas7bdat file and AWS Glue

I am working with a .sas7bdat file stored in my S3 bucket.

I want to convert the .sas7bdat file to CSV, but in Glue Visual ETL I cannot see an option for the .sas7bdat file format.

Can someone please help me with this?

1 Answer
Accepted Answer

AWS Glue does not natively support the .sas7bdat file format (SAS data file) in its Visual ETL tool. As a workaround, you can convert the .sas7bdat file to CSV with a Glue Python Shell job or a Glue Spark job. Here is one way to do it with a Python Shell job:

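You can use a Python Shell job in AWS Glue to download the .sas7bdat file from S3, load it into a pandas DataFrame, and upload the result back to S3 as a CSV file. The sas7bdat package opens local file paths rather than s3:// URIs, so the script copies the file to local storage with boto3 first. The sas7bdat package also has to be available to the job; on Python 3.9 Python Shell jobs this can typically be done with the --additional-python-modules job parameter. Here's what the script will look like:

import boto3
from sas7bdat import SAS7BDAT

s3 = boto3.client('s3')

# SAS7BDAT opens local paths, so download the file from S3 first
s3.download_file('your-bucket', 'your-file.sas7bdat', '/tmp/your-file.sas7bdat')

# Read the .sas7bdat file into a pandas DataFrame
with SAS7BDAT('/tmp/your-file.sas7bdat') as file:
    df = file.to_data_frame()

# Write the CSV locally, then upload it to S3
df.to_csv('/tmp/your-file.csv', index=False)
s3.upload_file('/tmp/your-file.csv', 'your-bucket', 'output-folder/your-file.csv')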
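
If the job is given its locations as s3:// URIs, note that boto3's S3 client calls take the bucket name and object key as separate arguments. The following is a minimal sketch of how the two could be derived. The helper names split_s3_uri and csv_key_for are hypothetical (they are not part of boto3 or sas7bdat), and the bucket, file, and folder names are the same placeholders used in the script.

```python
import posixpath


def split_s3_uri(uri):
    """Split 's3://bucket/key' into a (bucket, key) tuple."""
    prefix = "s3://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an S3 URI: {uri!r}")
    bucket, _, key = uri[len(prefix):].partition("/")
    if not bucket or not key:
        raise ValueError(f"S3 URI must include a bucket and a key: {uri!r}")
    return bucket, key


def csv_key_for(sas_key, output_prefix):
    """Build the output key: same base name with a .csv extension, under output_prefix."""
    base = posixpath.splitext(posixpath.basename(sas_key))[0]
    return posixpath.join(output_prefix, base + ".csv")


# Placeholder names taken from the script; replace with your own
bucket, key = split_s3_uri("s3://your-bucket/your-file.sas7bdat")
out_key = csv_key_for(key, "output-folder")
print(bucket, key, out_key)
```

The resulting bucket, key, and out_key values can then be passed to the S3 client's download and upload calls instead of hard-coded strings.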

Run the Job: Execute the Glue job, and it will read the .sas7bdat file, convert it to a CSV, and save it back to S3.

Ensure that your IAM roles and permissions are correctly set up to allow AWS Glue to access the S3 buckets and perform the necessary operations.
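
As a rough illustration of the S3 permissions involved, the sketch below builds a minimal IAM policy document as a Python dictionary and prints it as JSON. It assumes the job only needs s3:GetObject on the input file and s3:PutObject under the output folder; the function name build_s3_policy is hypothetical, the bucket and key names are placeholders, and a real setup may need more (for example s3:ListBucket, or KMS permissions for encrypted buckets).

```python
import json


def build_s3_policy(bucket, input_key, output_prefix):
    """Return a minimal IAM policy dict: read the input object, write under the output prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Assumed: the job reads exactly one input object
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{input_key}",
            },
            {
                # Assumed: the job writes only under the output prefix
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{output_prefix}/*",
            },
        ],
    }


# Placeholder names; replace with your own bucket, file, and folder
policy = build_s3_policy("your-bucket", "your-file.sas7bdat", "output-folder")
print(json.dumps(policy, indent=2))
```

The printed JSON can be compared against the policy attached to the IAM role that the Glue job runs as.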

EXPERT
answered 3 months ago
EXPERT
reviewed a month ago
  • Thank you for the solution!
