What Happens When I Insert a Row into an Athena Database Table?


My understanding is that Athena presents a database-like view of files stored in an S3 bucket. Is that correct? If so, what happens when I insert or modify a row (or multiple rows) in an Athena table via the AWS query editor? I tried it, and the insertion seems to have worked: when I query the table again, it shows the row I inserted. However, the underlying file itself (a CSV) did not change. Is there any publicly available documentation on upsert behaviour in Athena and how it interacts with the underlying S3 files?

2 Answers
Accepted Answer

Hello.

Athena's INSERT INTO does not write to the original file.
If you look at the table's location in S3, you should see that a new file has been created alongside it.
https://docs.aws.amazon.com/athena/latest/ug/insert-into.html

Athena writes files to source data locations in Amazon S3 as a result of the INSERT command. Each INSERT operation creates a new file, rather than appending to an existing file. The file locations depend on the structure of the table and the SELECT query, if present. Athena generates a data manifest file for each INSERT query. The manifest tracks the files that the query wrote. It is saved to the Athena query result location in Amazon S3. For more information, see Identifying query output files.
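To make the quoted behaviour concrete, here is a minimal local sketch in Python. It is only a model under stated assumptions, not Athena's implementation: a temporary directory stands in for the table's S3 location, and the helper names `insert_rows` and `select_all` are hypothetical. It shows why the original CSV stays byte-for-byte identical while a query still returns the inserted row.

```python
import csv
import tempfile
import uuid
from pathlib import Path


def insert_rows(table_dir: Path, rows: list[list[str]]) -> Path:
    """Model of an INSERT: write the rows to a brand-new file in the table location."""
    new_file = table_dir / f"{uuid.uuid4().hex}.csv"  # hypothetical naming scheme
    with new_file.open("w", newline="") as f:
        csv.writer(f).writerows(rows)
    return new_file


def select_all(table_dir: Path) -> list[list[str]]:
    """Model of SELECT *: read every data file under the table location."""
    result = []
    for path in sorted(table_dir.glob("*.csv")):
        with path.open(newline="") as f:
            result.extend(csv.reader(f))
    return result


with tempfile.TemporaryDirectory() as tmp:
    table_dir = Path(tmp)  # stands in for the table's S3 prefix
    original = table_dir / "original.csv"
    original.write_text("1,alice\n2,bob\n")
    before = original.read_bytes()

    insert_rows(table_dir, [["3", "carol"]])

    # The original file is untouched; the new row lives in a second file.
    original_unchanged = original.read_bytes() == before
    file_count = len(list(table_dir.glob("*.csv")))
    rows = sorted(select_all(table_dir))

print(original_unchanged, file_count, rows)
```

The query sees the union of all files under the location, which is why the inserted row appears even though the file you uploaded never changes.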

EXPERT
answered 5 months ago
EXPERT
reviewed 5 months ago
EXPERT
Kallu
reviewed 5 months ago

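To address the update/upsert part of the question: a table over plain files such as CSV can only be added to with INSERT. For updates and upserts you will need to use one of the open data lake table formats, e.g. Iceberg, Hudi, or Delta Lake. Within Athena itself, UPDATE, DELETE, and MERGE INTO are supported on Iceberg tables.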
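As a rough illustration of what an upsert could look like on the Iceberg route, the Python sketch below builds a MERGE INTO statement that could be pasted into the Athena query editor. It assumes the target has already been created as an Iceberg table; the table names `users_iceberg` and `users_staging`, the columns, and the helper `build_merge_sql` are all hypothetical.

```python
def build_merge_sql(target: str, source: str, key: str, columns: list[str]) -> str:
    """Build an upsert: update rows whose key matches, insert the rest.

    Identifiers are interpolated as-is, so pass only trusted names.
    """
    updates = ", ".join(f"{c} = s.{c}" for c in columns if c != key)
    col_list = ", ".join(columns)
    values = ", ".join(f"s.{c}" for c in columns)
    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {source} AS s\n"
        f"ON t.{key} = s.{key}\n"
        f"WHEN MATCHED THEN UPDATE SET {updates}\n"
        f"WHEN NOT MATCHED THEN INSERT ({col_list}) VALUES ({values})"
    )


# Hypothetical tables: an Iceberg target and a staging table holding new data.
sql = build_merge_sql("users_iceberg", "users_staging", "id", ["id", "name"])
print(sql)
```

Running the same statement against a table backed by plain CSV files is not expected to work, which is the point of the answer above.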

answered 5 months ago
