Synchronization times in Kendra can vary based on a number of factors, such as:
- Number and size of documents: Syncing a large number of big documents will take longer than a small number of short documents.
- Initial sync vs incremental sync: The initial full sync of an entire S3 bucket into Kendra will take longer than subsequent incremental syncs of updates/additions.
- Load on Kendra: If there are many indexes syncing data simultaneously, it may take more time compared to low usage periods.
- Index size: Bigger indexes with more data, users and query capacity take longer to update than smaller indexes.
As a rough estimate, it can take anywhere from minutes to hours for Amazon Kendra to fully sync contents from an S3 bucket depending on the above factors. I recommend checking the status in the AWS console to see if a sync has completed (see methods below).
For both the Developer and Enterprise tiers of Kendra, there is a $0.35 per hour connector charge while a synchronization is running. Once the sync job is shown as finished via either of the methods below (for example, a status of "SUCCEEDED" in the CLI output), you are no longer charged for connector usage.
To monitor the sync status of your Kendra index, you can use the Amazon Kendra console or AWS CLI.
Using the console:
- Sign in to the AWS Management Console and open the Amazon Kendra console.
- From the list of indexes, select the index you want to monitor.
- Choose the "Data sources" option from the left menu.
- Select the data source and scroll down to view the sync run history and metrics.
This will show you details of previous syncs, such as the start/end time and the number of documents added, deleted, or failed. You can also view the total document count indexed from that source.
Using the AWS CLI:
aws kendra list-data-source-sync-jobs --id <data-source-id> --index-id <index-id> --region <region>
Where:
- <data-source-id> is the ID of the data source
- <index-id> is the ID of the Kendra index
- <region> is the AWS region
This will list the status of ongoing or recent sync jobs for the given data source.
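If you would rather check this from a script, the same information is available through the `ListDataSourceSyncJobs` API. Below is a minimal Python sketch, assuming you pass in a boto3 Kendra client; the helper name `summarize_latest_sync` and the shape of the returned dictionary are my own choices for illustration, not part of any AWS SDK. It picks the most recent job from the returned history and works out how long it ran.

```python
from typing import Any, Optional


def summarize_latest_sync(client: Any, index_id: str, data_source_id: str) -> Optional[dict]:
    """Return status, duration and metrics of the most recent sync job.

    `client` is assumed to be a boto3 Kendra client, e.g.
    boto3.client("kendra", region_name="<region>").
    This helper is a hypothetical convenience wrapper, not an AWS API.
    """
    response = client.list_data_source_sync_jobs(Id=data_source_id, IndexId=index_id)
    history = response.get("History", [])
    if not history:
        return None  # no sync has been started for this data source yet

    # Choose the job that started most recently.
    latest = max(history, key=lambda job: job["StartTime"])

    # EndTime is missing while a job is still running.
    end_time = latest.get("EndTime")
    duration = end_time - latest["StartTime"] if end_time else None

    return {
        "status": latest["Status"],
        "duration": duration,
        "metrics": latest.get("Metrics", {}),
    }


# Example usage (requires boto3 and valid AWS credentials):
#   import boto3
#   kendra = boto3.client("kendra", region_name="<region>")
#   print(summarize_latest_sync(kendra, "<index-id>", "<data-source-id>"))
```

A `duration` of `None` means the latest job has not reported an end time yet, so it is most likely still syncing.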
Sources
[1] [Monitoring your index (console) - Amazon Kendra] (https://docs.aws.amazon.com/kendra/latest/dg/monitoring-runsync.html)
[2] [Creating an Amazon Kendra index and ingesting the metadata - Amazon Kendra] (https://docs.aws.amazon.com/kendra/latest/dg/tutorial-search-metadata-create-index-ingest.html)
Please accept answer if helpful!
Kendra scan of 20 text files, each 1kB, took 8 mins in my case. I scanned the same source a few times and the scan time is consistent.
Could it be an issue with the sync job? The data is no more than 20 MB, there are no parallel sync jobs, and Kendra does not have any prior data.