Querying rows containing gzipped and Base64-encoded strings in Athena


I have a setup where I am using Athena to query data exported from a DynamoDB table. I am using the DDB export to S3 capability, as mentioned in this blog post.

The table that I am querying contains some rows with values that are first gzipped and then base64 encoded before being written to DDB. I am looking for best practices on how to make this data available for querying in Athena.

The two options I can think of are:

  1. Transform the files in S3 once they are exported, base64 decoding and uncompressing the data.
  2. Write an Athena UDF to do the same during query execution.
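
Either option has to apply the same two steps in the reverse order of the write path: base64 decode first, then gunzip. A minimal Python sketch of that decode step follows. It assumes the original values were UTF-8 text, and the function names `decode_payload` and `encode_payload` are hypothetical, introduced only for illustration.

```python
import base64
import gzip


def decode_payload(encoded: str) -> str:
    """Reverse the write path: base64 decode first, then gunzip.

    Assumes the original value was UTF-8 text (an assumption, not stated
    in the question).
    """
    compressed = base64.b64decode(encoded)
    return gzip.decompress(compressed).decode("utf-8")


def encode_payload(text: str) -> str:
    """Mirror of the write path described above: gzip, then base64 encode."""
    compressed = gzip.compress(text.encode("utf-8"))
    return base64.b64encode(compressed).decode("ascii")


if __name__ == "__main__":
    # Round trip a sample value to show the two functions are inverses.
    sample = encode_payload('{"order_id": 42}')
    print(decode_payload(sample))
```

The same logic would apply whether it runs once as a preprocessing job (option 1) or on every query inside a UDF (option 2); the difference is only where the decompression cost is paid.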

Any recommendation on what might be the best option for such a use case? Is there something built into Glue transformations that I can take advantage of?

Saket
asked 2 years ago · 801 views
1 Answer

It probably makes sense to try the built-in from_base64() function that Athena supports.

AWS
Alex_T
answered 2 years ago
  • from_base64() will give me the zipped binary. I also need the ability to unzip the data. Is there a built-in function to do that?

  • I see, apologies, I thought you had the data in a gz file with one of the columns base64 encoded. There is no built-in function in Athena that would extract the data in this case. If you plan to query that data multiple times, I suggest processing it before you query it with Athena.
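
A rough sketch of what such a preprocessing step could look like in Python is shown here. It rests on several assumptions: that the export files are gzipped JSON Lines in DynamoDB JSON format (one `{"Item": {...}}` object per line), that the encoded value is stored as a string (`S`) attribute, and that the attribute is called `payload`. That name is hypothetical, as are the function names. Reading from and writing back to S3 is left out.

```python
import base64
import gzip
import json

# Hypothetical attribute name; replace with the real column holding
# the gzipped, base64-encoded value.
PAYLOAD_ATTR = "payload"


def transform_line(line: str) -> str:
    """Decode the payload attribute of one exported item, if present."""
    record = json.loads(line)
    attr = record.get("Item", {}).get(PAYLOAD_ATTR)
    # Assumption: the value is a string ("S") attribute in DynamoDB JSON.
    if isinstance(attr, dict) and "S" in attr:
        raw = gzip.decompress(base64.b64decode(attr["S"]))
        attr["S"] = raw.decode("utf-8")
    return json.dumps(record)


def transform_export(data: bytes) -> bytes:
    """Rewrite the contents of one gzipped JSON Lines export file."""
    lines = gzip.decompress(data).decode("utf-8").splitlines()
    decoded = [transform_line(line) for line in lines if line.strip()]
    return gzip.compress("\n".join(decoded).encode("utf-8"))
```

Items without the attribute pass through unchanged, so the output keeps the same one-item-per-line layout and an Athena table defined over the original export layout should, under these assumptions, still be able to read the rewritten files.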
