Partitions disappearing

I have several partitioned tables. After a period of time, queries against them return no data, and I have to rerun MSCK REPAIR TABLE to make the partitions visible again.

The process I have in place runs ALTER TABLE ADD PARTITION when new data is added to S3. This works for a while, but eventually queries return 0 rows and I am forced to run the repair.

The tables are partitioned with "PARTITIONED BY (v_id string, x_id string, year int, month int, day int, hour int)". Some tables are CSV and others are Avro; I've seen this on both.
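
For reference, here is a rough sketch of the kind of statement the ADD PARTITION step could issue for this key layout. The table name, the bucket, and the Hive-style key=value S3 prefix layout are assumptions for illustration and are not taken from my actual setup.

```python
# Partition keys in the order given in the PARTITIONED BY clause above.
STRING_KEYS = ("v_id", "x_id")
INT_KEYS = ("year", "month", "day", "hour")


def add_partition_sql(table, base_location, values):
    """Build an ALTER TABLE ... ADD IF NOT EXISTS PARTITION statement.

    table         -- fully qualified name, e.g. "mydb.events" (hypothetical)
    base_location -- S3 prefix of the table (hypothetical)
    values        -- dict with one entry per partition key
    """
    spec = []  # pieces of the PARTITION (...) clause
    path = []  # pieces of the key=value S3 prefix (assumed layout)
    for key in STRING_KEYS:
        value = str(values[key])
        # Reject characters that would break the SQL literal or the path.
        if "'" in value or "/" in value:
            raise ValueError(f"unsafe value for {key}: {value!r}")
        spec.append(f"{key} = '{value}'")
        path.append(f"{key}={value}")
    for key in INT_KEYS:
        value = int(values[key])
        spec.append(f"{key} = {value}")
        path.append(f"{key}={value}")
    location = base_location.rstrip("/") + "/" + "/".join(path) + "/"
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION ({', '.join(spec)}) "
        f"LOCATION '{location}'"
    )


print(add_partition_sql(
    "mydb.events",
    "s3://example-bucket/events",
    {"v_id": "v1", "x_id": "x9", "year": 2019, "month": 12, "day": 18, "hour": 6},
))
```

Passing the table as database.tablename keeps the statement independent of whichever database happens to be current when it runs.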

Any suggestions? I could automate MSCK REPAIR, but that seems like the wrong solution.

Edited by: lettermuckoo on Dec 18, 2019 6:13 AM

asked 4 years ago, 238 views
1 Answer
I found the issue.

A job was recreating the tables during deploys. MSCK REPAIR TABLE was run after each recreate, but the statement did not fully qualify the table as database.tablename, so it was not discovering the existing partitions.
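
As a minimal sketch of the fix, the deploy job can build the repair statement so that it always carries the database name. The helper name and the database and table names below are hypothetical, and the identifier check is a conservative assumption rather than Athena's exact naming rule.

```python
import re

# Conservative identifier check (an assumption, stricter than necessary).
_IDENTIFIER = re.compile(r"^[A-Za-z0-9_]+$")


def repair_table_sql(database, table):
    """Return a MSCK REPAIR TABLE statement with a fully qualified name."""
    for name in (database, table):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unexpected identifier: {name!r}")
    # database.tablename, so the statement does not depend on the
    # current database of the session that runs it.
    return f"MSCK REPAIR TABLE {database}.{table}"


# Hypothetical names, for illustration only.
print(repair_table_sql("mydb", "events"))
```

If the job submits queries through boto3, passing the database in the QueryExecutionContext argument of start_query_execution is another way to avoid relying on the default database.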

Edited by: lettermuckoo on Dec 18, 2019 1:56 PM

answered 4 years ago
