Can multiple Glue crawlers run concurrently?

A customer reports that a Glue crawler is single-threaded and that only one can run at a time. Is that correct?

What is the best practice for setting up crawlers? The customer has tens of datasets that need to be crawled frequently. How can they run multiple crawlers concurrently so that schema detection and data change detection finish quickly?

Thanks.

Moderator
Asked 5 years ago, 1698 views
1 answer
Accepted answer

Each individual crawler is single-threaded: you can't run the same crawler more than once at a time. Separate crawlers can run concurrently, though. The default limit is up to 50 crawlers per account, and I've had multiple distinct crawlers running at the same time.

https://docs.aws.amazon.com/general/latest/gr/aws_service_limits.html#limits_glue
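To illustrate running several distinct crawlers at once, here is a minimal sketch using boto3. It assumes the crawlers already exist and that the caller passes in a Glue client; the crawler names in `main` are hypothetical, and the polling interval is an arbitrary choice rather than a recommended value.

```python
import time


def start_crawlers(glue, crawler_names):
    """Start each crawler and return the names that were actually started."""
    started = []
    for name in crawler_names:
        try:
            glue.start_crawler(Name=name)
            started.append(name)
        except glue.exceptions.CrawlerRunningException:
            # The same crawler cannot run twice at once, so skip it.
            pass
    return started


def wait_until_ready(glue, crawler_names, poll_seconds=30):
    """Poll until every listed crawler has returned to the READY state."""
    pending = set(crawler_names)
    while pending:
        for name in list(pending):
            state = glue.get_crawler(Name=name)["Crawler"]["State"]
            if state == "READY":
                pending.discard(name)
        if pending:
            time.sleep(poll_seconds)


def main():
    # boto3 is imported here so the helpers above stay importable without it.
    import boto3

    glue = boto3.client("glue")
    names = ["sales-crawler", "orders-crawler"]  # hypothetical crawler names
    start_crawlers(glue, names)
    wait_until_ready(glue, names)
```

Calling `main()` starts both crawlers back to back without waiting for the first to finish, then blocks until both report `READY`. A crawler that is already running is skipped at start time but is still waited on.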

Crawler setup details are documented here: https://docs.aws.amazon.com/glue/latest/dg/crawler-configuration.html. How you configure your crawlers will depend on what the customer is trying to achieve.

Another thing to note: a single crawler can crawl multiple input data stores. If you have many separate data stores that all need crawling at the same time or frequency, you can have Glue either combine them into a single schema (in some circumstances) or create multiple schemas (if they're unique).
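As a rough sketch of the single-crawler, multiple-data-store setup, the helper below builds the arguments for boto3's `create_crawler` with several S3 targets. The function name, its parameters, and every name, ARN, and path in the usage note are hypothetical. The `CombineCompatibleSchemas` grouping policy is, to my understanding, the crawler configuration option that asks Glue to merge compatible schemas into one table; check the crawler configuration page linked above before relying on it.

```python
import json


def build_crawler_request(name, role_arn, database, s3_paths,
                          combine_schemas=False, schedule=None):
    """Build keyword arguments for glue.create_crawler with several S3 targets."""
    request = {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        # One crawler, many data stores: each path becomes its own S3 target.
        "Targets": {"S3Targets": [{"Path": path} for path in s3_paths]},
    }
    if combine_schemas:
        # Ask Glue to group compatible schemas into a single table.
        request["Configuration"] = json.dumps(
            {
                "Version": 1.0,
                "Grouping": {"TableGroupingPolicy": "CombineCompatibleSchemas"},
            }
        )
    if schedule is not None:
        # Cron expression in the form Glue expects, e.g. "cron(0 * * * ? *)".
        request["Schedule"] = schedule
    return request
```

With a boto3 Glue client, usage would look like `glue.create_crawler(**build_crawler_request("all-datasets", role_arn, "analytics", ["s3://example-bucket/sales/", "s3://example-bucket/orders/"]))`, where every value shown is a placeholder.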

AWS
Answered 5 years ago