Using Glue Crawler to crawl parquet files


I created an AWS Glue crawler to crawl Parquet files stored in S3. My understanding is that Parquet is handled by the crawler's built-in classifiers, so no explicit configuration is needed. However, although the crawler run completes successfully, I do not see any tables created. Has anyone run into this, and is any special configuration required?

asked a year ago · 1.7K views
1 Answer

Based on what's described here, it seems there are no errors in the CloudWatch logs for the crawler.

Can you please confirm that there are no access denied errors? Also check that the role attached to the crawler has access to the S3 path and, if the bucket is encrypted with SSE-KMS using a customer managed key (CMK), to that KMS key as well. You could also create a new role with the required permissions, attach it to the crawler, and see whether the behavior changes. Finally, verify that there are no explicit deny policies at the bucket or KMS key level.
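As a rough way to spot explicit denies, the sketch below scans a policy document (for example, a bucket policy or KMS key policy you have copied out as JSON) for `Deny` statements that match a given action and resource. The function name `explicit_denies`, the bucket name, and the sample policy are all made up for illustration. The matching is deliberately simplified: it ignores `Condition`, `Principal`, `NotAction`, and `NotResource`, so treat a hit as a pointer for closer inspection rather than a full policy evaluation.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """Policy fields may be a single string or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def explicit_denies(policy, action, resource_arn):
    """Return the Deny statements in `policy` matching the action and resource.

    Simplified matching (assumption): only `*` and `?` wildcards in Action and
    Resource are considered; Condition, Principal, NotAction and NotResource
    are ignored.
    """
    hits = []
    for stmt in _as_list(policy.get("Statement")):
        if stmt.get("Effect") != "Deny":
            continue
        # IAM action names are case-insensitive, resource ARNs are not.
        action_match = any(
            fnmatchcase(action.lower(), pattern.lower())
            for pattern in _as_list(stmt.get("Action"))
        )
        resource_match = any(
            fnmatchcase(resource_arn, pattern)
            for pattern in _as_list(stmt.get("Resource"))
        )
        if action_match and resource_match:
            hits.append(stmt)
    return hits


# Hypothetical bucket policy, used only to exercise the function.
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::example-bucket/*",
        },
        {
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": "arn:aws:s3:::example-bucket/parquet/*",
        },
    ],
}

blocked = explicit_denies(
    bucket_policy,
    "s3:GetObject",
    "arn:aws:s3:::example-bucket/parquet/part-0000.parquet",
)
for stmt in blocked:
    print("Explicit deny:", stmt["Action"], "on", stmt["Resource"])
```

Running the same check with `s3:ListBucket` against the bucket ARN and with `kms:Decrypt` against the key policy would cover the other permissions a crawler typically needs for an SSE-KMS encrypted bucket.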

I'd also check whether the same database already contains tables related to this data (files), because in that case you may not see any new tables created.
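One way to check for such tables is to compare each table's storage location against the S3 path the crawler targets. The sketch below assumes table entries shaped like the `TableList` items returned by the Glue `get_tables` API (a `Name` plus a `StorageDescriptor` with a `Location`). The helper names, the sample tables, and the S3 paths are hypothetical.

```python
def _normalize(location):
    """Treat 's3://bucket/path' and 's3://bucket/path/' as the same prefix."""
    return location.rstrip("/") + "/"


def tables_overlapping(tables, s3_path):
    """Return names of tables whose location is at, under, or above `s3_path`."""
    target = _normalize(s3_path)
    matches = []
    for table in tables:
        location = table.get("StorageDescriptor", {}).get("Location")
        if not location:
            continue
        loc = _normalize(location)
        if loc.startswith(target) or target.startswith(loc):
            matches.append(table["Name"])
    return matches


# Hypothetical catalog contents. With boto3 and credentials available, the
# list could instead be collected from the Glue "get_tables" paginator:
#   paginator = boto3.client("glue").get_paginator("get_tables")
#   for page in paginator.paginate(DatabaseName="my_database"):
#       tables.extend(page["TableList"])
tables = [
    {
        "Name": "sales_parquet",
        "StorageDescriptor": {"Location": "s3://example-bucket/parquet/sales/"},
    },
    {
        "Name": "logs_json",
        "StorageDescriptor": {"Location": "s3://example-bucket/logs"},
    },
]

existing = tables_overlapping(tables, "s3://example-bucket/parquet")
print(existing)
```

If the list is not empty, the crawler may have updated those tables instead of creating new ones, so their schema and last-updated time are worth a look.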

Comment here with how it goes; happy to assist further.


AWS
answered a year ago
