New Partition Availability

Let's say I am running an INSERT INTO statement in Athena that writes new partitions. When do those new partitions become available to a SELECT query? Is it possible to run a SELECT statement while that INSERT INTO is still running against the same Glue Catalog table and get partial data from a new partition, or does the partition become available only after it is fully written?

1 Answer

New partitions become visible to SELECT queries once their metadata is available in the catalog, which happens after you run either

MSCK REPAIR TABLE

or (more lightweight and therefore preferred)

ALTER TABLE ... ADD PARTITION

You can, however, add those partitions in advance, before any data is written to them. In that case the data becomes available to SELECT queries as soon as INSERT INTO SELECT adds some of the files to those partitions.
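As a rough illustration of registering a partition in advance, here is a minimal Python sketch that builds the ALTER TABLE ... ADD PARTITION statement and, optionally, submits it through boto3. The table name `events`, the partition column `dt`, the bucket `example-bucket`, and the helper names are all hypothetical and not part of the question; the builder does not escape quotes in values, so treat it as a starting point rather than a finished tool.

```python
from typing import Dict, Optional


def add_partition_ddl(table: str, partition: Dict[str, str],
                      location: Optional[str] = None) -> str:
    """Build an Athena ALTER TABLE ... ADD PARTITION statement.

    Values are wrapped in single quotes as-is (no escaping), which is
    only safe for trusted input.
    """
    spec = ", ".join(f"{col} = '{val}'" for col, val in partition.items())
    ddl = f"ALTER TABLE {table} ADD IF NOT EXISTS PARTITION ({spec})"
    if location:
        ddl += f" LOCATION '{location}'"
    return ddl


def submit(ddl: str, database: str, output_location: str) -> str:
    """Submit the statement to Athena and return the query execution ID.

    Assumes boto3 is installed and AWS credentials are configured.
    """
    import boto3  # third-party; imported lazily so the builder works without it

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        QueryString=ddl,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


# Hypothetical table, partition value, and S3 path, for illustration only.
statement = add_partition_ddl(
    "events",
    {"dt": "2024-01-01"},
    location="s3://example-bucket/events/dt=2024-01-01/",
)
print(statement)
```

IF NOT EXISTS is used so that re-running the statement for a partition that is already registered does not fail.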

https://docs.aws.amazon.com/athena/latest/ug/msck-repair-table.html

https://docs.aws.amazon.com/athena/latest/ug/alter-table-add-partition.html

AWS
Alex_T
answered 2 years ago
  • If I don't ADD PARTITION in advance, and don't call MSCK REPAIR, is it still the case that "the data will be available to SELECT queries as soon as some of the files are added to those partitions by INSERT INTO SELECT"? That would essentially be the same as saying that in this scenario, the existence of a partition does not guarantee the corresponding INSERT INTO has finished writing the partition.
