What Happens When I Insert a Row into an Athena Database Table?


My understanding is that Athena presents a database-like view over files stored in an S3 bucket. Is that correct? If so, what happens when I insert or modify a row (or multiple rows) in an Athena table via the AWS query editor? I tried it, and the insertion seemed to work: when I query the table again, the row I inserted shows up. But the underlying file itself (a CSV) did not change. Is there any publicly available documentation on upsert behaviour in Athena and how it interacts with the underlying S3 files?

Asked 5 months ago, viewed 401 times
2 Answers
Accepted Answer

Hello.

I don't think Athena's INSERT writes to the original file.
If you look at the table's location in S3, you should see that a new file has been created alongside it. The documentation describes this:
https://docs.aws.amazon.com/athena/latest/ug/insert-into.html

Athena writes files to source data locations in Amazon S3 as a result of the INSERT command. Each INSERT operation creates a new file, rather than appending to an existing file. The file locations depend on the structure of the table and the SELECT query, if present. Athena generates a data manifest file for each INSERT query. The manifest tracks the files that the query wrote. It is saved to the Athena query result location in Amazon S3. For more information, see Identifying query output files.
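To make the quoted behaviour concrete, here is a toy model in plain Python. It is only a sketch of the idea, not how Athena is implemented: a local temporary directory stands in for the table's S3 location, and the helper names `insert_rows` and `select_all` are made up for illustration. It shows why the original CSV stays byte-for-byte the same while a query still returns the new row.

```python
import csv
import tempfile
import uuid
from pathlib import Path


def insert_rows(table_dir: Path, rows: list[list[str]]) -> Path:
    """Toy INSERT: write the rows to a brand-new file, never touching existing ones."""
    new_file = table_dir / f"{uuid.uuid4().hex}.csv"
    with new_file.open("w", newline="") as f:
        csv.writer(f).writerows(rows)
    return new_file


def select_all(table_dir: Path) -> list[list[str]]:
    """Toy SELECT *: the 'table' is every file under the location, read together."""
    rows: list[list[str]] = []
    for path in table_dir.glob("*.csv"):
        with path.open(newline="") as f:
            rows.extend(csv.reader(f))
    return rows


def demo() -> tuple[str, str, int, list[list[str]]]:
    """Insert one row and report the original file before/after, file count, and rows."""
    with tempfile.TemporaryDirectory() as tmp:
        table_dir = Path(tmp)  # stands in for the table's S3 prefix
        original = table_dir / "original.csv"
        original.write_text("1,alice\n")

        before = original.read_text()
        insert_rows(table_dir, [["2", "bob"]])
        after = original.read_text()

        n_files = len(list(table_dir.glob("*.csv")))
        return before, after, n_files, sorted(select_all(table_dir))


if __name__ == "__main__":
    before, after, n_files, rows = demo()
    print("original file unchanged:", before == after)
    print("files under table location:", n_files)
    print("rows seen by a query:", rows)
```

In the real service the new object lands under the table's S3 location (with a generated name), so listing that prefix after the INSERT should show the extra file.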

Answered 5 months ago by an Expert
Reviewed 5 months ago by an Expert
Reviewed 5 months ago by Kallu (Expert)

Just to address the update/upsert question: that is where you will need to use one of the open data lake table formats, e.g. Apache Iceberg, Apache Hudi, or Delta Lake.
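As a rough illustration of the Iceberg route, the sketch below builds the two statements you would typically run in Athena: a `CREATE TABLE` with `'table_type' = 'ICEBERG'`, and a `MERGE INTO` that acts as an upsert. The table names, columns (`id`, `name`), and S3 paths are all hypothetical, and as far as I know `MERGE INTO` needs Athena engine version 3, so please check the docs for your setup. The `submit` helper uses boto3's `start_query_execution` and is not exercised here.

```python
def iceberg_ddl(table: str, location: str) -> str:
    """Build a CREATE TABLE statement for an Iceberg table (hypothetical schema)."""
    return (
        f"CREATE TABLE {table} (id int, name string) "
        f"LOCATION '{location}' "
        "TBLPROPERTIES ('table_type' = 'ICEBERG')"
    )


def upsert_sql(target: str, source: str) -> str:
    """Build a MERGE INTO statement: update rows that match on id, insert the rest."""
    return (
        f"MERGE INTO {target} t USING {source} s ON t.id = s.id "
        "WHEN MATCHED THEN UPDATE SET name = s.name "
        "WHEN NOT MATCHED THEN INSERT (id, name) VALUES (s.id, s.name)"
    )


def submit(sql: str, output_location: str) -> str:
    """Send a statement to Athena; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the SQL builders work without it

    response = boto3.client("athena").start_query_execution(
        QueryString=sql,
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


if __name__ == "__main__":
    # Hypothetical names and paths, for illustration only.
    print(iceberg_ddl("mydb.customers", "s3://my-bucket/customers/"))
    print(upsert_sql("mydb.customers", "mydb.customers_staging"))
```

Even with these formats the existing data files are generally not edited in place; the format writes new data and metadata files and tracks which ones are current.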

Answered 5 months ago
