
Kendra search not delivering any 'smart' results


I am currently evaluating Kendra as an option for an internal search engine for our organization. I put several text documents on different subjects into an S3 bucket and added that bucket as a data source to a Kendra index. When I search the indexed content through the search console, the quality of the results is poor, and the search does not seem to use any ML/LLM features. I am wondering whether I missed a step.

It only returns results when my query contains an exact word match with one of the text documents. The texts are in German, and the data source and index are configured for German content. The search console is also set to find German content. For example, one text is a long article about dinosaurs. If I search for 'Dino', a common German abbreviation for 'Dinosaurier', it finds nothing. The same happens if I describe the features of dinosaurs in more abstract terms. Is this intended, or did I miss enabling some feature? If I index the same documents myself in a vector store such as Pinecone or Postgres with pgvector, I get much better semantic search capabilities.

asked 8 months ago, 288 views
1 Answer

It sounds like you may be experiencing some limitations with the basic lexical search capabilities of Amazon Kendra, and you haven't yet enabled or configured some of the more advanced semantic search features.

By default, Kendra uses lexical search, which primarily matches exact words or phrases in the query to the indexed documents. This explains why you're only getting results for exact word matches and not seeing matches for related terms or concepts.

To improve your search results and leverage Kendra's more advanced capabilities, you might want to consider the following:

  1. Semantic search: Kendra offers semantic search capabilities that can help understand the intent behind queries and return more relevant results, even when exact word matches aren't present. This feature uses machine learning models to encode text into vectors, allowing for similarity matching based on meaning rather than just words.

  2. Custom synonyms: For cases like "Dino" and "Dinosaurier", you can set up custom synonyms in your Kendra index. This would allow Kendra to understand that these terms are related and return relevant results for both.

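  3. Language settings: Kendra does not expose a configurable analyzer, so the language code is what tells it to treat the content as German. Confirm that the documents were actually indexed with the German language code (de) and that queries filter on the same language; a mismatch here can hide documents or degrade matching.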

  4. Fine-tuning relevance: You may need to adjust the boosting of certain metadata fields or attributes to improve the ranking of your search results.

  5. Query expansion: Kendra can expand queries to include related terms, which might help in cases where you're describing dinosaur features without using the exact word.
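
As a rough illustration of item 2, a Kendra thesaurus is a text file in Solr synonym format that is uploaded to S3 and attached to the index with the boto3 `create_thesaurus` call. The sketch below is a minimal example under stated assumptions, not a verified setup: the helper names are hypothetical, the index ID, role ARN, bucket, and key are placeholders you would supply, and it is worth checking the Kendra language support documentation to confirm that custom synonyms apply to German content before relying on this.

```python
def build_synonym_file(groups):
    """Render synonym groups as Solr-format lines, one group per line."""
    lines = []
    for group in groups:
        terms = [t.strip() for t in group if t.strip()]
        if len(terms) < 2:
            continue  # a synonym rule needs at least two terms
        lines.append(", ".join(terms))
    return "".join(line + "\n" for line in lines)


def thesaurus_request(index_id, role_arn, bucket, key, name="german-synonyms"):
    """Build keyword arguments for the boto3 Kendra create_thesaurus call."""
    return {
        "IndexId": index_id,
        "Name": name,  # hypothetical default name
        "RoleArn": role_arn,
        "SourceS3Path": {"Bucket": bucket, "Key": key},
    }


def upload_and_attach(groups, index_id, role_arn, bucket, key):
    """Upload the synonym file and register it (needs AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without boto3

    body = build_synonym_file(groups).encode("utf-8")
    boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=body)
    kendra = boto3.client("kendra")
    return kendra.create_thesaurus(
        **thesaurus_request(index_id, role_arn, bucket, key)
    )
```

For the example in the question, `build_synonym_file([["Dino", "Dinosaurier"]])` produces a single rule that treats both terms as equivalent. The IAM role passed as `role_arn` needs read access to the S3 object.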

If you've confirmed that these features are set up correctly and you're still not seeing improvements, review your indexing configuration to make sure all relevant documents are actually being processed. Additionally, consider using Kendra's Retrieve API for retrieval-augmented generation (RAG) use cases; it returns longer, semantically relevant passages rather than short excerpts.
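
To make the Retrieve API suggestion concrete, here is a minimal sketch of calling it from Python with a German language filter. It assumes a boto3 Kendra client and a placeholder index ID; `language_filter` and `retrieve_passages` are hypothetical helper names, while `_language_code` is the reserved Kendra attribute for document language. Whether the passages are any better for German content is something to test against your own index rather than take for granted.

```python
def language_filter(code="de"):
    """Attribute filter restricting results to one document language."""
    return {
        "EqualsTo": {
            "Key": "_language_code",
            "Value": {"StringValue": code},
        }
    }


def retrieve_passages(client, index_id, text, language="de", page_size=5):
    """Call the Kendra Retrieve API and return (title, passage) pairs.

    `client` is expected to behave like boto3.client("kendra").
    """
    response = client.retrieve(
        IndexId=index_id,
        QueryText=text,
        AttributeFilter=language_filter(language),
        PageSize=page_size,
    )
    return [
        (item.get("DocumentTitle", ""), item.get("Content", ""))
        for item in response.get("ResultItems", [])
    ]
```

With credentials configured, a call would look like `retrieve_passages(boto3.client("kendra"), "<your-index-id>", "Dino")`. Running the same query with and without the thesaurus attached is a quick way to see which of the suggestions above actually changes the results.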

Remember that building an effective search solution often requires iterative refinement and tuning to get the best results for your specific use case and content.
Sources
Retrieving responses from indexes in Amazon Kendra - Amazon Kendra
Amazon OpenSearch Service search enhancements: 2023 roundup | AWS Big Data Blog
Troubleshooting document search results - Amazon Kendra

answered 8 months ago
EXPERT
reviewed 8 months ago
