Querying rows containing a Gzipped and Base64-encoded string in Athena


I have a setup where I use Athena to query data exported from a DynamoDB table. I am using the DDB export to S3 capability, as mentioned in this blog post.

The table that I am querying contains some rows whose values are first Gzipped and then Base64 encoded before being written to DDB. I am looking for best practices on how to make this data available for querying in Athena.

The two options I can think of are:

  1. Transform the files in S3 after they are exported, Base64 decoding and decompressing the data.
  2. Write an Athena UDF that does the same during query execution.
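
Both options need the same decoding step, applied in reverse of the write order: Base64 decode first, then gunzip. A minimal Python sketch of that step, assuming the original value was UTF-8 text (the function names here are illustrative, not part of any AWS API):

```python
import base64
import gzip


def encode_payload(text: str) -> str:
    """Gzip a string, then Base64-encode it (mirrors how the rows were written)."""
    return base64.b64encode(gzip.compress(text.encode("utf-8"))).decode("ascii")


def decode_payload(encoded: str) -> str:
    """Reverse the encoding: Base64-decode first, then gunzip.

    Assumes the decompressed bytes are UTF-8 text.
    """
    return gzip.decompress(base64.b64decode(encoded)).decode("utf-8")


# Round trip on a made-up value to check the order of operations.
sample = encode_payload('{"order_id": 42}')
restored = decode_payload(sample)
```

The round trip only shows the order of operations; where it runs (a batch job over the exported files or a UDF) is a separate decision.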

Any recommendations on the best option for this use case? Is there something built into Glue transformations that I can take advantage of?

Saket
asked 2 years ago · 799 views
1 Answer

It probably makes sense to try the built-in from_base64() function that Athena supports.

AWS
Alex_T
answered 2 years ago
  • from_base64() will give me the gzipped binary. I also need to be able to decompress the data. Is there a built-in function to do that?

  • I see, apologies. I thought you had the data in a .gz file with one of the columns Base64 encoded. There is no built-in function in Athena that would extract the data in this case. If you plan to query that data multiple times, I suggest processing it before you query it with Athena.
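
The preprocessing suggested above could look roughly like the following sketch, which rewrites one exported record as a flat JSON line with the encoded attribute decoded. It assumes the export uses the DynamoDB JSON layout with one `{"Item": {...}}` object per line, that the decompressed value is UTF-8 text, and that the encoded attribute is called `payload` (a hypothetical name; substitute the real one).

```python
import base64
import gzip
import json

# Hypothetical name of the attribute that holds the encoded string.
ENCODED_ATTRIBUTE = "payload"


def transform_line(line: str) -> str:
    """Turn one exported record into a flat JSON line with the payload decoded.

    Assumes the DynamoDB JSON layout {"Item": {"name": {"S": "value"}}}.
    Nested types (maps, lists) are passed through in their DynamoDB JSON form.
    """
    item = json.loads(line)["Item"]
    flat = {}
    for name, typed_value in item.items():
        # Each value is a single-entry dict such as {"S": "..."} or {"N": "1"};
        # keep only the inner value and drop the type tag.
        flat[name] = next(iter(typed_value.values()))
    if ENCODED_ATTRIBUTE in flat:
        raw = gzip.decompress(base64.b64decode(flat[ENCODED_ATTRIBUTE]))
        flat[ENCODED_ATTRIBUTE] = raw.decode("utf-8")  # assumes UTF-8 text
    return json.dumps(flat)


# Example with a made-up record.
encoded = base64.b64encode(gzip.compress(b"hello")).decode("ascii")
line = json.dumps({"Item": {"id": {"S": "1"}, "payload": {"S": encoded}}})
print(transform_line(line))
```

Running this once per export and writing the result to a separate S3 prefix means Athena reads plain text on every query instead of decoding at query time.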
