Amazon Q Webcrawler Error: Exception in Starting Crawler Threads with Sitemap Datasource


Hi everyone,

I've recently encountered an issue with Amazon Q's webcrawler data source and was hoping to get some insights or solutions from the community. I'm working with a sitemap data source and have hit a specific error that's proving quite puzzling.

Error Message: The logs show the error "Exception occurred while starting crawler threads," and I'm trying to understand what it means and how to resolve it.

Observations:

  1. The webcrawler can connect to and read the sitemap XML. I have verified this through CloudWatch logs.
  2. Despite the successful connection, the crawler only processes about 5 pages and then stops abruptly.
  3. The sitemap in question contains over 100 pages, so it's unclear why the crawler is stopping at 5.
  4. Notably, the error message appears in the logs before the 5 pages are crawled.
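For anyone wanting to reproduce the checks above, here is a minimal sketch of how I verified the sitemap itself is well-formed and counted its entries, using only Python's standard library. The sample XML below is a stand-in for my real sitemap (which I can't share); in practice you'd fetch your actual sitemap URL and pass its body to the same function.

```python
# Diagnostic sketch: confirm the sitemap parses cleanly and count its <url>
# entries, to rule out a malformed sitemap as the reason the crawl stops early.
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def count_sitemap_urls(xml_text: str) -> int:
    """Return the number of <url> entries in a sitemap document.

    Raises xml.etree.ElementTree.ParseError if the XML is malformed,
    which itself would be a useful diagnostic signal.
    """
    root = ET.fromstring(xml_text)
    return len(root.findall(f"{{{SITEMAP_NS}}}url"))


# Stand-in sitemap with two entries; substitute your real sitemap body here.
sample = f"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="{SITEMAP_NS}">
  <url><loc>https://example.com/page1</loc></url>
  <url><loc>https://example.com/page2</loc></url>
</urlset>"""

print(count_sitemap_urls(sample))  # 2
```

If this count matches the 100+ pages you expect and the XML parses without error, the problem is more likely on the crawler side than in the sitemap itself.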

Questions:

  1. Has anyone else encountered this specific error?
  2. What could be causing the "Exception occurred while starting crawler threads" error, particularly when the crawler seems capable of initiating the crawl?
  3. Are there any known limitations or configurations in Amazon Q that I might be overlooking which could lead to this issue?

Any insights, experiences, or suggestions you could share would be greatly appreciated. I'm keen to understand this error better and find a solution to ensure the crawler can process all the pages in the sitemap as intended.

Thanks in advance for your help!

jftsg
asked 5 months ago · 85 views
No Answers
