AWS Glue Crawler errors out trying to find partitionValues.latest in a Delta Lake


I'm seeing errors like the following when I run a crawler against a non-native Delta Lake S3 folder I have:

WARN : Cannot get schema or partition columns or partition values for Delta table: BUCKET/PATH, got exception: com.amazonaws.services.glue.exceptions.S3NoSuchKeyException: No object found for bucket: glue-dataplane-prod-us-east-1-state-tree-v2 key: d0d989b0-e5e5-4233-a4a1-286ecdee15b2/file_schemas/BUCKET/PATH/partitionValues.latest

And it's correct: there's no partitionValues.latest file in the Delta Lake folder. But I don't know what that file is, and I've never seen it in any of my Delta Lakes. I also don't know what the uuid/file_schemas part of the key refers to.
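One detail worth noting in the warning is that the bucket it names is glue-dataplane-prod-us-east-1-state-tree-v2 rather than my own bucket, so the missing key appears to live somewhere on the Glue side. Below is a minimal sketch that splits the message into its parts to make that easier to see. The helper name `parse_missing_object` and the labels it returns are made up for illustration, and the key layout (uuid, then file_schemas, then the table path) is only inferred from this one log line.

```python
import re

# Warning text copied from the crawler log (BUCKET/PATH is the redacted table location).
WARNING = (
    "WARN : Cannot get schema or partition columns or partition values for Delta table: "
    "BUCKET/PATH, got exception: com.amazonaws.services.glue.exceptions.S3NoSuchKeyException: "
    "No object found for bucket: glue-dataplane-prod-us-east-1-state-tree-v2 "
    "key: d0d989b0-e5e5-4233-a4a1-286ecdee15b2/file_schemas/BUCKET/PATH/partitionValues.latest"
)

# Matches the "No object found for bucket: X key: Y" clause of the exception text.
PATTERN = re.compile(r"No object found for bucket: (?P<bucket>\S+) key: (?P<key>\S+)")


def parse_missing_object(message):
    """Split the S3NoSuchKeyException text into bucket and key components.

    Hypothetical helper: the key layout is an assumption based on one log line.
    """
    match = PATTERN.search(message)
    if match is None:
        raise ValueError("message does not contain a 'No object found' clause")
    # Assumed layout: <uuid>/<area>/<table path>/<file name>
    prefix, area, rest = match.group("key").split("/", 2)
    table_path, _, filename = rest.rpartition("/")
    return {
        "bucket": match.group("bucket"),
        "prefix": prefix,
        "area": area,
        "table_path": table_path,
        "filename": filename,
    }


if __name__ == "__main__":
    for name, value in parse_missing_object(WARNING).items():
        print(f"{name}: {value}")
```

If that reading is right, the crawler is looking for the file under a key that merely embeds my table path, not inside the Delta Lake folder itself.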

I have other Delta Lakes that work fine without this file, using an identical (afaict) crawler setup. Even this crawler works some of the time. It succeeded once on a Delta Lake that lacks this file, but it reports the error for that same Delta Lake when I have it crawl a second Delta Lake alongside the one that worked. All of a sudden, neither Delta Lake passes muster.
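For anyone trying to reproduce this, a crawler with two Delta targets can be defined through boto3 roughly as follows. This is a sketch rather than my exact configuration: the crawler name, role ARN, database name and second table path are placeholders, and the WriteManifest and CreateNativeDeltaTable values are assumptions about what a "non-native" setup looks like.

```python
def build_crawler_request(name, role_arn, database, delta_table_paths):
    """Build keyword arguments for the Glue create_crawler call.

    Hypothetical helper; every value passed in below is a placeholder.
    """
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {
            "DeltaTargets": [
                {
                    # One crawler, several Delta table locations.
                    "DeltaTables": list(delta_table_paths),
                    # Assumed flag values for a non-native (symlink-style) table.
                    "WriteManifest": False,
                    "CreateNativeDeltaTable": False,
                }
            ]
        },
    }


def create_crawler(request):
    """Send the request to AWS Glue (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the builder above works without boto3

    boto3.client("glue").create_crawler(**request)


request = build_crawler_request(
    name="example-delta-crawler",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueRole",
    database="example_db",
    delta_table_paths=["s3://BUCKET/PATH/", "s3://BUCKET/OTHER_PATH/"],
)
```

Calling `create_crawler(request)` would submit it. The behaviour described above showed up when the target list held two tables instead of one.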

Have other folks seen this error? Is partitionValues.latest part of a version of the Delta Lake spec that I'm not using?

(I'm creating these lakes using the delta-spark python package, version 2.3.0 -- the latest at the time of writing.)

Thanks for any tips.

mikix
asked a year ago · 93 views
No answers