Avoid metadata from Athena with Boto 3

2

I'm trying to schedule a data transformation with Athena using python and boto 3 (via glue). Once the query is launched, the data should be stored at an S3 sub-bucket.

I need the subbucket to have just the data, but the query creates a metadata file. I didn't find a way to avoid the query to create the metadata file I'm using the start_query_execution from boto 3 to run the query:

queryStart = client.start_query_execution(
    QueryString = query,
    QueryExecutionContext = {
        'Database': database
    }, 
    ResultConfiguration = { 'OutputLocation': 's3://' + bucket + '/' + subbucketpath}
)

I tried with the below function to remove the metadata file

s3 = session.resource('s3')
my_bucket = s3.Bucket(bucket)
for item in my_bucket.objects.filter(Prefix=subbucketpath):
      if item.endswith('.csv.metadata'):
            item.delete()

but it gives an error: AttributeError: 's3.ObjectSummary' object has no attribute 'endswith'.

Is there any other way to launch the Athena query from Glue or to remove the '.csv.metadata' files?

asked 2 years ago279 views
1 Answer
0

Hi,

Athena automatically creates metadata files when it moves files using the start_query_execution command. In order to delete the .csv.metadata files, you can use the following logic below. Make sure to use item.key to get the name of the object. The try statement will skip over the s3.ObjectSummary object that is giving this error.

session = boto3.session.Session()
s3 = session.resource('s3')
my_bucket = s3.Bucket(<bucketname>)
for item in my_bucket.objects.filter(Prefix=<subbucketpath>):
      try:
            if item.key.endswith('.csv.metadata'):
                  item.delete()
      except Exception as e:
            print("The following error occured: {}".format(e))

Reference: https://docs.aws.amazon.com/athena/latest/ug/querying.html

AWS
answered 8 months ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions