Kendra Sync Indexing taking more than 2 hours and still not finished

I have a few files in S3, about 5-10 files with a total size of about 15 MB. My Kendra data source started a sync, and after 2 hours it is still in the indexing state and has not completed. How long is it expected to take? Is this normal? I have added administrator access permissions to my role so that it has the permissions needed for Kendra data source indexing. I am following this blog:

https://aws.amazon.com/blogs/machine-learning/quickly-build-high-accuracy-generative-ai-applications-on-enterprise-data-using-amazon-kendra-langchain-and-large-language-models/

  • I scanned about 20 text files (less than 1 KB each) and the Kendra sync took about 8 minutes each time.

1 Answer

According to https://docs.aws.amazon.com/kendra/latest/dg/troubleshooting-data-sources.html:

If there are no updates to documents, sync time for a Amazon Kendra index increases in linear proportion to the number of documents. For example, 1,000 documents without any updates would take about five minutes to sync and 2,000 documents without any updates will take about 10 minutes. If there are any updates to the documents, then the sync time will increase based on the number of documents updated.
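
As a rough check, the example in the quoted passage can be turned into a back-of-the-envelope estimate. This is only a sketch: the function name and the constant are illustrative, the figure of 5 minutes per 1,000 documents comes from the quote and applies to documents without updates, and a first sync that adds new documents will take longer than this.

```python
# Linear extrapolation of the documentation example quoted above:
# about 5 minutes per 1,000 documents when nothing has changed.
# Assumption: the relationship stays linear for very small document counts.
MINUTES_PER_1000_DOCS = 5.0


def estimated_sync_minutes(num_documents: int) -> float:
    """Estimate sync time in minutes for documents with no updates."""
    if num_documents < 0:
        raise ValueError("num_documents must be non-negative")
    return num_documents * MINUTES_PER_1000_DOCS / 1000.0


for count in (10, 1000, 2000):
    print(f"{count} documents: about {estimated_sync_minutes(count):.2f} minutes")
```

Even allowing for fixed overhead and for the extra work of indexing new documents, a handful of files should finish in minutes rather than hours.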

So it looks like there is a problem with the sync itself, possibly with access to the source. See https://docs.aws.amazon.com/kendra/latest/dg/iam-roles.html if you need help creating or updating the policies that grant Kendra access to S3.
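
One way to see what the sync is actually doing is to look at the sync job history for the data source. The sketch below assumes boto3 is installed and AWS credentials are configured. It uses the Kendra `list_data_source_sync_jobs` call, and `summarize_sync_jobs` is a hypothetical helper written for this answer, not part of any AWS library. The index ID and data source ID are values you supply yourself.

```python
from typing import Any, Dict, List


def fetch_sync_history(index_id: str, data_source_id: str) -> List[Dict[str, Any]]:
    """Fetch the sync job history for one data source (needs boto3 and credentials)."""
    import boto3  # imported lazily so the helper below works without boto3

    kendra = boto3.client("kendra")
    response = kendra.list_data_source_sync_jobs(Id=data_source_id, IndexId=index_id)
    return response.get("History", [])


def summarize_sync_jobs(history: List[Dict[str, Any]]) -> List[str]:
    """Build one readable line per sync job: status, document counts, error text."""
    summaries = []
    for job in history:
        metrics = job.get("Metrics", {})  # Kendra reports these counts as strings
        summary = (
            f"{job.get('ExecutionId', 'unknown')}: "
            f"status={job.get('Status', 'UNKNOWN')}, "
            f"scanned={metrics.get('DocumentsScanned', '0')}, "
            f"failed={metrics.get('DocumentsFailed', '0')}"
        )
        error = job.get("ErrorMessage")
        if error:
            summary += f", error={error}"
        summaries.append(summary)
    return summaries


# Usage, with your own IDs in place of the placeholders:
#   for line in summarize_sync_jobs(fetch_sync_history("<index-id>", "<data-source-id>")):
#       print(line)
```

A job that stays in `SYNCING_INDEXING` with no documents scanned, or a `FAILED` job with an error message, would point toward an access or configuration problem rather than slow indexing.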

AWS
answered 2 years ago
EXPERT
verified 2 years ago
  • I have just a few PDF files in S3. One thing to note is that I have not added any metadata along with the files. The Kendra documentation for S3 as a data source says metadata is optional. Could this be due to a metadata issue? As for IAM roles, I have also tried attaching the administrator role for testing.

  • No, I don't think it's a metadata issue. Have you followed these instructions https://docs.aws.amazon.com/kendra/latest/dg/create-ds-s3.html?
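
On the IAM side, one thing worth checking is the role's trust policy, which is separate from its permissions: attaching administrator permissions does not by itself let Kendra assume the role. The sketch below checks whether a trust policy document allows the `kendra.amazonaws.com` service principal to call `sts:AssumeRole`. The function name `trusts_kendra` is hypothetical, and this is a possible cause to rule out, not a confirmed diagnosis of the problem in this thread.

```python
import json
from typing import Any, Dict, Union

KENDRA_PRINCIPAL = "kendra.amazonaws.com"


def _as_list(value: Any) -> list:
    """IAM policy fields may hold a single value or a list; normalise to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def trusts_kendra(trust_policy: Union[str, Dict[str, Any]]) -> bool:
    """Return True if some Allow statement lets Kendra call sts:AssumeRole."""
    if isinstance(trust_policy, str):
        trust_policy = json.loads(trust_policy)
    for statement in _as_list(trust_policy.get("Statement")):
        if statement.get("Effect") != "Allow":
            continue
        if "sts:AssumeRole" not in _as_list(statement.get("Action")):
            continue
        principal = statement.get("Principal")
        if not isinstance(principal, dict):
            continue  # e.g. a bare "*" principal; not treated as a Kendra trust here
        if KENDRA_PRINCIPAL in _as_list(principal.get("Service")):
            return True
    return False


# Usage with boto3 (role name is a placeholder you supply):
#   role = boto3.client("iam").get_role(RoleName="<role-name>")
#   print(trusts_kendra(role["Role"]["AssumeRolePolicyDocument"]))
```

The check is deliberately simple: it ignores `Condition` blocks and wildcard actions, so a `False` result means the policy deserves a closer look, not that it is definitely wrong.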
