Querying rows containing GZipped and Base64 encoded string in Athena

0

I have a setup where I am using Athena to query data exported from DynamoDB table. I am using DDB export to S3 capability as mentioned in this blog post.

The table that I am querying contains some rows which are first Gzipped and then base64 encoded before writing to DDB. I am looking for some best practices on how can I make this data available for query in Athena?

Two options that I can think of are

  1. Transforming file in S3 once they are exported to base64 decode and uncompress the data.
  2. Write an Athena UDF to do the same during query execution.

Any recommendation on what might be the best option for such use-case? Is there something in built in Glue transformation that I can take advantage of?

Saket
質問済み 2年前799ビュー
1回答
0

Probably makes sense to try from_base64() built-in function that Athena supports

AWS
Alex_T
回答済み 2年前
  • from_base64() will give me the zipped binary. I also need ability to unzip the data. Any built-in function to do that?

  • I see, apologies, i thought you have data in gz file and one of columns is base64 encoded. there's no built-in function in Athena that would extract the data in this case. If you plan to query that data multiple times i suggest to process that data before you query it with Athena.

ログインしていません。 ログイン 回答を投稿する。

優れた回答とは、質問に明確に答え、建設的なフィードバックを提供し、質問者の専門分野におけるスキルの向上を促すものです。

質問に答えるためのガイドライン

関連するコンテンツ