Querying rows containing gzipped and Base64-encoded strings in Athena

I have a setup where I am using Athena to query data exported from a DynamoDB table. I am using the DynamoDB export to S3 capability, as mentioned in this blog post.

The table that I am querying contains some rows whose values are first gzipped and then base64 encoded before being written to DynamoDB. I am looking for best practices on how to make this data available for querying in Athena.

Two options that I can think of are:

  1. Transform the files in S3 once they are exported, base64-decoding and decompressing the data.
  2. Write an Athena UDF to do the same during query execution.
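
For reference, the decoding step that either option has to perform can be sketched in Python as below. This is only an illustration: the function names are hypothetical, and it assumes the original value was UTF-8 text that was gzipped first and base64 encoded second, so reading it back reverses that order.

```python
import base64
import gzip


def encode_payload(text: str) -> str:
    """Write path described above: gzip first, then base64-encode."""
    compressed = gzip.compress(text.encode("utf-8"))
    return base64.b64encode(compressed).decode("ascii")


def decode_payload(encoded: str) -> str:
    """Read path: base64-decode first, then gunzip (assumes UTF-8 text)."""
    compressed = base64.b64decode(encoded)
    return gzip.decompress(compressed).decode("utf-8")
```

Calling `decode_payload(encode_payload(s))` should return `s` unchanged for any string, which is a quick way to confirm the order of operations.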

Any recommendations on which option would be best for this use case? Is there a built-in Glue transformation that I can take advantage of?

Saket
asked 2 years ago · 802 views
1 Answer
It probably makes sense to try the built-in from_base64() function that Athena supports.

AWS
Alex_T
answered 2 years ago
  • from_base64() will give me the gzipped binary. I also need the ability to decompress the data. Is there a built-in function to do that?

  • I see, apologies; I thought you had the data in a .gz file with one of the columns base64 encoded. There is no built-in function in Athena that would extract the data in this case. If you plan to query that data multiple times, I suggest processing it before you query it with Athena.
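
A minimal sketch of that kind of pre-processing in Python follows. It rests on several assumptions not stated in this thread: the export uses the DynamoDB JSON format, with gzipped newline-delimited files where each line looks like `{"Item": {...}}`; the encoded value is stored as a String (`"S"`) attribute; and the files have already been copied locally. The attribute name `payload` and both function names are hypothetical. Since only some rows are encoded, values that fail to decode are left unchanged.

```python
import base64
import gzip
import json
import zlib


def decode_item_line(line: str, attribute: str = "payload") -> str:
    """Decode one export line; `attribute` is a hypothetical column name."""
    record = json.loads(line)
    cell = record.get("Item", {}).get(attribute)
    if isinstance(cell, dict) and isinstance(cell.get("S"), str):
        try:
            # validate=True rejects strings containing non-base64 characters
            compressed = base64.b64decode(cell["S"], validate=True)
            cell["S"] = gzip.decompress(compressed).decode("utf-8")
        except (ValueError, OSError, EOFError, zlib.error):
            pass  # not a gzipped, base64-encoded value: leave it as-is
    return json.dumps(record)


def transform_export_file(src_path: str, dst_path: str,
                          attribute: str = "payload") -> None:
    """Rewrite a local gzipped export file with the attribute decoded."""
    with gzip.open(src_path, "rt", encoding="utf-8") as src, \
         gzip.open(dst_path, "wt", encoding="utf-8") as dst:
        for line in src:
            if line.strip():
                dst.write(decode_item_line(line, attribute) + "\n")
```

The rewritten files can then be uploaded to a separate S3 prefix and used as the table location in Athena, so that the decoding cost is paid once rather than on every query.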
