Avoid metadata from Athena with Boto 3

I'm trying to schedule a data transformation with Athena using Python and Boto3 (via Glue). Once the query is launched, the data should be stored under an S3 prefix (a "sub-bucket").

I need the sub-bucket to contain just the data, but the query also creates a metadata file, and I didn't find a way to prevent that. I'm using start_query_execution from Boto3 to run the query:

queryStart = client.start_query_execution(
    QueryString = query,
    QueryExecutionContext = {
        'Database': database
    }, 
    ResultConfiguration = { 'OutputLocation': 's3://' + bucket + '/' + subbucketpath}
)

I tried the code below to remove the metadata file:

s3 = session.resource('s3')
my_bucket = s3.Bucket(bucket)
for item in my_bucket.objects.filter(Prefix=subbucketpath):
      if item.endswith('.csv.metadata'):
            item.delete()

but it gives an error: AttributeError: 's3.ObjectSummary' object has no attribute 'endswith'.

Is there any other way to launch the Athena query from Glue or to remove the '.csv.metadata' files?

asked 2 years ago · 317 views
1 Answer
Hi,

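Athena automatically writes a metadata file (.csv.metadata) next to the results of every query started with start_query_execution, and as far as I know this cannot be switched off. To delete the .csv.metadata files, you can use the logic below. The AttributeError occurs because the loop yields s3.ObjectSummary objects, not strings, so call endswith on item.key, which holds the name of the object. The try statement prints any error raised while checking or deleting an object and moves on to the next one.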

import boto3

session = boto3.session.Session()
s3 = session.resource('s3')
my_bucket = s3.Bucket(bucket)  # same bucket name as in OutputLocation
# objects.filter() yields s3.ObjectSummary items; the object name is item.key
for item in my_bucket.objects.filter(Prefix=subbucketpath):
    try:
        if item.key.endswith('.csv.metadata'):
            item.delete()
    except Exception as e:
        print("The following error occurred: {}".format(e))
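
One thing to keep in mind: start_query_execution returns as soon as the query is submitted, so a cleanup that runs right away may not find the metadata file yet. Below is a minimal sketch, not a definitive implementation, that polls get_query_execution until the query finishes and then deletes the single metadata object for that query. The helper names (wait_for_query, metadata_location, delete_query_metadata) and the poll_seconds parameter are my own. The sketch also assumes a SELECT query, so that the OutputLocation returned by get_query_execution points at the result .csv file and the metadata file sits next to it with ".metadata" appended.

```python
import time
from urllib.parse import urlparse

TERMINAL_STATES = ("SUCCEEDED", "FAILED", "CANCELLED")


def wait_for_query(athena_client, query_execution_id, poll_seconds=2.0):
    """Poll Athena until the query reaches a terminal state; return the last response."""
    while True:
        response = athena_client.get_query_execution(
            QueryExecutionId=query_execution_id
        )
        state = response["QueryExecution"]["Status"]["State"]
        if state in TERMINAL_STATES:
            return response
        time.sleep(poll_seconds)


def metadata_location(result_location):
    """Turn the S3 URI of a result file into (bucket, key of its metadata file).

    Assumption: the metadata object is the result object's key plus '.metadata'.
    """
    parsed = urlparse(result_location)
    return parsed.netloc, parsed.path.lstrip("/") + ".metadata"


def delete_query_metadata(athena_client, s3_client, query_execution_id,
                          poll_seconds=2.0):
    """Wait for the query, then delete only that query's metadata object."""
    response = wait_for_query(athena_client, query_execution_id, poll_seconds)
    execution = response["QueryExecution"]
    state = execution["Status"]["State"]
    if state != "SUCCEEDED":
        raise RuntimeError("Query ended in state {}".format(state))
    bucket_name, key = metadata_location(
        execution["ResultConfiguration"]["OutputLocation"]
    )
    s3_client.delete_object(Bucket=bucket_name, Key=key)
    return bucket_name, key


# Hypothetical usage with real clients (requires boto3 and AWS credentials):
#   import boto3
#   athena = boto3.client("athena")
#   s3_client = boto3.client("s3")
#   delete_query_metadata(athena, s3_client, queryStart["QueryExecutionId"])
```

Deleting by exact key avoids listing the whole prefix, and it leaves metadata files from other queries untouched. If you want to clear every .csv.metadata file under the prefix instead, the loop above still applies once the query has finished.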

Reference: https://docs.aws.amazon.com/athena/latest/ug/querying.html

AWS
answered 9 months ago
