partitions disappearing

I have several partitioned tables and have noticed that after a period of time queries return no data and I need to rerun MSCK REPAIR TABLE to make them visible again.

The process I have in place runs ADD PARTITIONS when new data is added to S3. This works for a period of time, but eventually I see queries return 0 rows and am forced to run the repair.

The tables are partitioned with "PARTITIONED BY (v_id string, x_id string, year int, month int, day int, hour int)". Some tables are CSV and others are AVRO, and I've seen this on both.

Any suggestions? I could automate MSCK REPAIR, but that seems like the wrong solution.

Edited by: lettermuckoo on Dec 18, 2019 6:13 AM

Asked 4 years ago, 245 views
1 Answer
I found the issue.

A job was recreating the tables during deploys. It ran MSCK REPAIR TABLE after each recreate, but the statement did not use the fully qualified database.tablename, so the repair was not discovering the existing partitions on the recreated table.
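For anyone hitting the same thing, here is a minimal sketch of how a deploy job could build its statements so the database is always named explicitly. The helper names and the example database/table names ("analytics", "events") are made up for illustration and are not from my actual job. It only builds the statement strings; how they get submitted depends on your setup. The ADD PARTITION helper leaves out a LOCATION clause, which assumes the data sits under the table location in key=value directories.

```python
import re

# Conservative identifier check: letters, digits and underscores only.
_IDENT = re.compile(r"[A-Za-z0-9_]+")


def qualified_name(database: str, table: str) -> str:
    """Return database.tablename, refusing empty or unusual identifiers."""
    for part in (database, table):
        if not _IDENT.fullmatch(part):
            raise ValueError(f"unsupported identifier: {part!r}")
    return f"{database}.{table}"


def repair_statement(database: str, table: str) -> str:
    """Build an MSCK REPAIR TABLE statement that names the database explicitly."""
    return f"MSCK REPAIR TABLE {qualified_name(database, table)}"


def add_partition_statement(database: str, table: str, partition: dict) -> str:
    """Build ALTER TABLE ... ADD IF NOT EXISTS PARTITION for one partition.

    Integer values are emitted bare; everything else is single-quoted.
    """
    specs = []
    for key, value in partition.items():
        if not _IDENT.fullmatch(key):
            raise ValueError(f"unsupported partition key: {key!r}")
        if isinstance(value, int):
            specs.append(f"{key}={value}")
        else:
            text = str(value)
            if "'" in text:
                raise ValueError(f"unsupported partition value: {text!r}")
            specs.append(f"{key}='{text}'")
    target = qualified_name(database, table)
    return f"ALTER TABLE {target} ADD IF NOT EXISTS PARTITION ({', '.join(specs)})"


if __name__ == "__main__":
    # Hypothetical names, matching the partition keys from the question.
    print(repair_statement("analytics", "events"))
    print(add_partition_statement(
        "analytics", "events",
        {"v_id": "v1", "x_id": "x1", "year": 2019, "month": 12, "day": 18, "hour": 6},
    ))
```

Refusing an empty or odd database name up front means the job fails loudly instead of silently running the repair against whatever database happens to be the default.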

Edited by: lettermuckoo on Dec 18, 2019 1:56 PM

Answered 4 years ago
