Working with .sas7bdat file and AWS Glue


I am working with a .sas7bdat file stored in my S3 bucket.

I want to convert the .sas7bdat file to CSV, but in Glue Visual ETL I cannot see an option for the sas7bdat file format.

Can someone please help me with this?

Cdash
Asked 3 months ago · 287 views
1 answer
Accepted answer

AWS Glue's Visual ETL tool does not natively support the .sas7bdat (SAS data file) format. As a workaround, you can convert the file to CSV with a Glue Python Shell job or a Glue Spark job. Here is the Python Shell approach:

A Python Shell job can read the .sas7bdat file from S3, load it into a pandas DataFrame, and save it back to S3 as a CSV file. The sas7bdat package may not be preinstalled in the job environment, so you may need to add it to the job's Python modules. The script looks roughly like this:

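import boto3
from sas7bdat import SAS7BDAT

s3 = boto3.client('s3')

# SAS7BDAT opens local paths only, so download the file from S3 first
s3.download_file('your-bucket', 'your-file.sas7bdat', '/tmp/your-file.sas7bdat')

# Read the .sas7bdat file into a pandas DataFrame
with SAS7BDAT('/tmp/your-file.sas7bdat') as file:
    df = file.to_data_frame()

# Write the CSV locally, then upload it to S3
df.to_csv('/tmp/your-file.csv', index=False)
s3.upload_file('/tmp/your-file.csv', 'your-bucket', 'output-folder/your-file.csv')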
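If the file is too large to hold in memory at once, a chunked variant may help. The following is only a sketch under assumptions, not the method described in this answer: it swaps the sas7bdat package for pandas' own `read_sas` reader with `chunksize`, and the helper names (`split_s3_uri`, `sas_to_csv_chunked`, `convert_s3_object`) and the `/tmp` working directory are hypothetical.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/key' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def sas_to_csv_chunked(sas_path, csv_path, chunksize=100_000):
    """Convert a local .sas7bdat file to CSV one chunk at a time."""
    import pandas as pd  # imported lazily; must be available in the job

    reader = pd.read_sas(sas_path, format="sas7bdat", chunksize=chunksize)
    first = True
    for chunk in reader:
        # Write the header with the first chunk, then append the rest.
        chunk.to_csv(csv_path, mode="w" if first else "a", header=first, index=False)
        first = False


def convert_s3_object(src_uri, dst_uri, workdir="/tmp"):
    """Download the SAS file from S3, convert it locally, upload the CSV."""
    import boto3  # imported lazily; must be available in the job

    s3 = boto3.client("s3")
    src_bucket, src_key = split_s3_uri(src_uri)
    dst_bucket, dst_key = split_s3_uri(dst_uri)
    local_sas = f"{workdir}/input.sas7bdat"
    local_csv = f"{workdir}/output.csv"
    s3.download_file(src_bucket, src_key, local_sas)
    sas_to_csv_chunked(local_sas, local_csv)
    s3.upload_file(local_csv, dst_bucket, dst_key)


# Example call with the placeholder names used in this answer:
# convert_s3_object("s3://your-bucket/your-file.sas7bdat",
#                   "s3://your-bucket/output-folder/your-file.csv")
```

Note that `read_sas` returns string columns as bytes unless an `encoding` argument is passed, so you may want to set one that matches your data.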

Run the Job: Execute the Glue job, and it will read the .sas7bdat file, convert it to a CSV, and save it back to S3.
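You can run the job from the console, or start it from code. As a rough sketch, the helper below starts a run with boto3 and polls until it finishes. The function name `run_glue_job`, the polling interval, and the job name `sas7bdat-to-csv` are assumptions for illustration; the Glue client is passed in so the helper can be exercised without AWS access.

```python
import time

# Job run states after which Glue will not change the run any further.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT", "ERROR", "EXPIRED"}


def run_glue_job(glue, job_name, poll_seconds=15):
    """Start a Glue job run and block until it reaches a terminal state."""
    run_id = glue.start_job_run(JobName=job_name)["JobRunId"]
    while True:
        response = glue.get_job_run(JobName=job_name, RunId=run_id)
        state = response["JobRun"]["JobRunState"]
        if state in TERMINAL_STATES:
            return run_id, state
        time.sleep(poll_seconds)


# Example call (hypothetical job name):
# import boto3
# run_id, state = run_glue_job(boto3.client("glue"), "sas7bdat-to-csv")
```

If the returned state is anything other than SUCCEEDED, the job's logs are the place to look for the cause.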

Ensure that your IAM roles and permissions are correctly set up to allow AWS Glue to access the S3 buckets and perform the necessary operations.
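For the S3 part of those permissions, a minimal policy could look something like the sketch below. It builds the policy document in Python so the bucket and paths are easy to swap; the function name `build_s3_policy` is hypothetical, the bucket and key names are the placeholders from the script, and the job role typically needs further Glue service permissions that are not shown here.

```python
import json


def build_s3_policy(bucket, input_key, output_prefix):
    """Return an IAM policy document: read the input object, write under the output prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{input_key}"],
            },
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{output_prefix.rstrip('/')}/*"],
            },
        ],
    }


policy = build_s3_policy("your-bucket", "your-file.sas7bdat", "output-folder/")
print(json.dumps(policy, indent=2))
```

The printed JSON can be attached to the job's role as an inline policy after you replace the placeholder names with your own.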

If this has resolved your issue or was helpful, accepting the answer would be greatly appreciated. Thank you!

Expert, answered 3 months ago
Expert, reviewed 2 months ago
  • Thank you for the solution!
