Can multiple Glue crawlers run concurrently?

A customer reports that a Glue crawler is single-threaded and that only one can run at a time. Is that correct?

What is the best practice for setting up crawlers? The customer has tens of datasets that need to be crawled frequently. How can they run multiple crawlers concurrently so that schema detection and data change detection finish quickly?

Thanks.

MODERATOR
asked 5 years ago · 1698 views
1 Answer
Accepted Answer

Each individual crawler is single-threaded: you can't run the same crawler more than once at a time. Different crawlers can run in parallel, though. The default limit is 50 crawlers per account, and I've had multiple distinct crawlers running at the same time.

https://docs.aws.amazon.com/general/latest/gr/aws_service_limits.html#limits_glue
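To make the "separate crawlers in parallel" point concrete, here is a minimal sketch of kicking off several crawlers in one go. It assumes a boto3 Glue client is passed in as `glue`; the helper name `start_crawlers` and the crawler names are hypothetical. `start_crawler` returns as soon as the run is accepted, so the crawlers proceed side by side, and a crawler that is still busy from a previous run raises `CrawlerRunningException` rather than starting a second copy.

```python
def start_crawlers(glue, crawler_names):
    """Start each named crawler and report which ones were already running.

    `glue` is assumed to be a boto3 Glue client (boto3.client("glue")).
    Returns a tuple (started, already_running) of crawler name lists.
    """
    started, already_running = [], []
    for name in crawler_names:
        try:
            # Returns immediately; the crawl itself runs asynchronously.
            glue.start_crawler(Name=name)
            started.append(name)
        except glue.exceptions.CrawlerRunningException:
            # The same crawler cannot run twice at once, so skip it.
            already_running.append(name)
    return started, already_running
```

With boto3 installed and credentials configured, this could be called as `start_crawlers(boto3.client("glue"), ["sales-crawler", "orders-crawler"])`, where both crawler names are placeholders.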

Crawler setup details are documented here: https://docs.aws.amazon.com/glue/latest/dg/crawler-configuration.html

How you configure the crawlers will depend on what the customer is trying to achieve.

Also note that a single crawler can crawl multiple input data stores. If you have many separate data stores that all need crawling on the same schedule, one crawler can cover them all, and Glue will either combine them into a single schema (in some circumstances) or create multiple schemas (if they're unique).
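As a rough illustration of the single-crawler option, the sketch below builds the arguments for boto3's `create_crawler` with several S3 paths as targets. The helper name `build_crawler_request`, the bucket paths, role, and database names are all hypothetical, and it assumes the data stores are S3 prefixes. The optional `Configuration` value uses the documented `CombineCompatibleSchemas` grouping policy, which asks Glue to merge compatible schemas into one table where it can.

```python
import json


def build_crawler_request(name, role_arn, database, s3_paths,
                          schedule=None, combine_schemas=False):
    """Build keyword arguments for glue.create_crawler(**request).

    All S3 paths are attached to one crawler, so they are crawled
    together on the same schedule.
    """
    request = {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        # One crawler, many data stores.
        "Targets": {"S3Targets": [{"Path": path} for path in s3_paths]},
    }
    if schedule:
        # Glue cron syntax, e.g. "cron(0 * * * ? *)" for hourly.
        request["Schedule"] = schedule
    if combine_schemas:
        # Ask Glue to merge compatible schemas into a single table.
        request["Configuration"] = json.dumps({
            "Version": 1.0,
            "Grouping": {"TableGroupingPolicy": "CombineCompatibleSchemas"},
        })
    return request
```

The resulting dictionary could then be passed as `boto3.client("glue").create_crawler(**request)`. Whether to combine schemas or keep them separate depends on how similar the datasets actually are.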

AWS
answered 5 years ago
