1 Answer
Hello, the following two-step approach may resolve this:
1. Use an AWS Lambda function to convert Excel sheets to CSV:
Lambda Function Setup:
- Use the pandas library to read Excel sheets and convert them to CSV.
- Store each CSV in an S3 bucket.
```python
import boto3
import pandas as pd
from io import BytesIO, StringIO

s3 = boto3.client('s3')

def lambda_handler(event, context):
    bucket = 'your-s3-bucket'
    key = 'path/to/your/file.xlsx'

    # Download the Excel file from S3
    obj = s3.get_object(Bucket=bucket, Key=key)
    excel_data = obj['Body'].read()

    # Load the workbook (reading .xlsx requires the openpyxl package,
    # e.g. packaged in a Lambda layer)
    xls = pd.ExcelFile(BytesIO(excel_data))

    # Convert each sheet to CSV and upload it back to S3
    for sheet_name in xls.sheet_names:
        df = pd.read_excel(xls, sheet_name=sheet_name)
        csv_buffer = StringIO()  # to_csv writes text, so use StringIO
        df.to_csv(csv_buffer, index=False)
        s3.put_object(
            Bucket=bucket,
            Key=f'path/to/csv/{sheet_name}.csv',
            Body=csv_buffer.getvalue(),
        )
```
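One detail worth handling: sheet names can contain spaces or punctuation that are awkward in S3 keys and in the table names a Glue crawler derives from them. A minimal sketch of a key builder (the `path/to/csv` prefix is the placeholder from the code above; the sanitization rule is an assumption, adjust to your naming needs) that also gives each sheet its own prefix, which helps the crawler create one table per sheet:

```python
import re

def csv_key_for_sheet(sheet_name, prefix='path/to/csv'):
    """Build an S3 key for one sheet's CSV, giving each sheet its own
    prefix (folder) so a Glue crawler can create one table per sheet.

    Assumption: lowercasing and collapsing non-alphanumerics to
    underscores is acceptable for your sheet names.
    """
    safe = re.sub(r'[^0-9a-z]+', '_', sheet_name.lower()).strip('_')
    return f'{prefix}/{safe}/{safe}.csv'
```

For example, `csv_key_for_sheet('Q1 Sales')` returns `'path/to/csv/q1_sales/q1_sales.csv'`, which you could pass as `Key` in the `put_object` call above.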
2. Use an AWS Glue Crawler to crawl the CSV files:
   1. Create an AWS Glue Crawler:
      - Set the S3 path where the CSV files are stored.
      - Configure the crawler to create separate tables for each CSV file.
      - Run the crawler to update the Glue Data Catalog.
   2. Query the tables in Athena:
      - After the crawler runs, a table for each sheet will be available in the Glue Data Catalog.
      - Use Athena to query the tables.
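The Athena step above can be scripted with boto3 as well. A minimal sketch, assuming the crawler wrote the tables into a hypothetical database `my_db` and that you have an S3 location configured for Athena query results (both names are placeholders):

```python
def preview_query(database, table, limit=10):
    """Return a simple Athena SQL statement previewing one crawled table.

    Identifiers are double-quoted in case sheet names produced table
    names with unusual characters.
    """
    return f'SELECT * FROM "{database}"."{table}" LIMIT {limit}'

def run_athena_query(sql, output_location):
    """Submit the query via boto3 and return the query execution id.

    `output_location` is an S3 path for Athena results, e.g.
    's3://your-s3-bucket/athena-results/'. boto3 is imported inside the
    function so the sketch stays importable without AWS credentials.
    """
    import boto3
    athena = boto3.client('athena')
    resp = athena.start_query_execution(
        QueryString=sql,
        ResultConfiguration={'OutputLocation': output_location},
    )
    return resp['QueryExecutionId']
```

Usage would look like `run_athena_query(preview_query('my_db', 'q1_sales'), 's3://your-s3-bucket/athena-results/')`.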

How do I configure Crawler to create separate tables for each CSV file?
When setting up the AWS Glue Crawler, point it at the S3 path where your CSV files are stored. By default, the crawler creates a separate table for each distinct schema it discovers; to guarantee one table per CSV file, store each file under its own S3 prefix (folder) and leave the "Create a single schema for each S3 path" grouping option disabled. After the crawler runs, you'll have an individual table in the Glue Data Catalog for each CSV file, ready to query in Athena.
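If you prefer to set the crawler up programmatically, here is a minimal sketch using one S3 target per CSV prefix. The crawler name, role ARN, database name, and bucket paths are all placeholders, and it assumes a Glue service role already exists in your account:

```python
def crawler_config(name, role_arn, csv_prefixes, database='excel_csvs'):
    """Build the arguments for glue.create_crawler, with one S3 target
    per CSV prefix so each file gets its own table.

    `csv_prefixes` are s3:// paths, one folder per CSV file.
    """
    return {
        'Name': name,
        'Role': role_arn,
        'DatabaseName': database,
        'Targets': {'S3Targets': [{'Path': p} for p in csv_prefixes]},
    }

def create_and_run_crawler(config):
    """Create and start the crawler. Requires AWS credentials; boto3 is
    imported inside the function so the sketch stays importable."""
    import boto3
    glue = boto3.client('glue')
    glue.create_crawler(**config)
    glue.start_crawler(Name=config['Name'])
```

After `start_crawler` completes, the per-file tables appear in the configured Glue database and are queryable from Athena.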