It seems you're encountering an issue with the AttributeFilter in your Kendra query. There are a few things to consider:
- API Difference: You're using the Retrieve API, but the AttributeFilter is typically used with the Query API. The Retrieve API has different parameters and functionality compared to the Query API.
- Attribute Key: The correct attribute key for the document URI is typically '_source_uri' or 'SourceURI', so your approach there is correct.
- Filter Operation: Instead of 'ContainsAny', you might want to use 'Contains' since you're looking for a specific string within the URI.
- Value Type: For a single string value, use 'StringValue' instead of 'StringListValue'.
Here's a suggested modification to your code, using the Query API instead of Retrieve:
response = client.query(
    IndexId=KENDRA_INDEX_ID,
    QueryText=query,
    PageSize=page_size,
    PageNumber=page_number,
    AttributeFilter={
        'Contains': {
            'Key': '_source_uri',
            'Value': {'StringValue': 'client-138'}
        }
    }
)
If you specifically need to use the Retrieve API, note that it might not support the same filtering capabilities as the Query API. In that case, you may need to perform the filtering on the client side after retrieving the results.
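A minimal sketch of that client-side filtering, assuming each Retrieve result item carries a `DocumentURI` field as described in the Kendra API reference. The helper name `filter_by_uri` and the sample data are hypothetical, and the boto3 call is shown only as a comment:

```python
from typing import Any, Dict, List


def filter_by_uri(result_items: List[Dict[str, Any]], needle: str) -> List[Dict[str, Any]]:
    """Keep only the result items whose DocumentURI contains `needle`."""
    return [item for item in result_items if needle in item.get("DocumentURI", "")]


# Hypothetical usage against a live index (not executed here):
#   response = client.retrieve(IndexId=KENDRA_INDEX_ID, QueryText=query)
#   passages = filter_by_uri(response["ResultItems"], "client-138")

# Stand-in data shaped like Retrieve result items, for illustration only.
sample_items = [
    {"DocumentId": "a", "DocumentURI": "s3://bucket/client-138/report.pdf"},
    {"DocumentId": "b", "DocumentURI": "s3://bucket/client-200/report.pdf"},
    {"DocumentId": "c"},  # an item without a URI is dropped
]
matching = filter_by_uri(sample_items, "client-138")
```

Note that this only narrows the items the API actually returned, so relevant passages that did not make it into the response are never seen by the filter.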
Also, ensure that the documents in your index have the correct metadata for the '_source_uri' field and that it contains the value you're filtering for. If the filtering still doesn't work, you might want to check your index configuration and verify that the attribute you're filtering on is properly set up and indexed.
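One way to check how an attribute is set up is to look at the `DocumentMetadataConfigurations` list returned by `describe_index`. The sketch below is an assumption-laden illustration: the helper name `attribute_type` and the sample response fragment are hypothetical, and the live call appears only as a comment.

```python
from typing import Any, Dict, Optional


def attribute_type(describe_index_response: Dict[str, Any], name: str) -> Optional[str]:
    """Return the configured type of an index attribute, or None if absent."""
    for config in describe_index_response.get("DocumentMetadataConfigurations", []):
        if config.get("Name") == name:
            return config.get("Type")
    return None


# Hypothetical usage against a live index (not executed here):
#   response = client.describe_index(Id=KENDRA_INDEX_ID)
#   print(attribute_type(response, "_source_uri"))

# Stand-in response fragment, for illustration only.
sample_response = {
    "DocumentMetadataConfigurations": [
        {"Name": "_source_uri", "Type": "STRING_VALUE"},
        {"Name": "tags", "Type": "STRING_LIST_VALUE"},
    ]
}
source_uri_type = attribute_type(sample_response, "_source_uri")
```

Whether the type comes back as `STRING_VALUE` or `STRING_LIST_VALUE` is worth knowing before choosing a filter operator and value type.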
The reason the 'ContainsAny' filter is not working is that it works only with 'StringList' type attributes, and '_source_uri' is a reserved index field of 'String' type. As an alternative, you can filter with custom logic on the client side. Sample logic is shared below:
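import boto3

# Assumes AWS credentials and region are already configured.
# INDEX_ID and QUERY_TEXT are placeholders for your index ID and query string.
kendra = boto3.client('kendra')

response = kendra.query(
    IndexId=INDEX_ID,
    QueryText=QUERY_TEXT,
    PageSize=100,
    # Return _source_uri with each result so it can be inspected below
    RequestedDocumentAttributes=['_source_uri']
)

# Filter client-138: keep results whose _source_uri contains '/client-138/'
filtered_docs = [
    item for item in response['ResultItems']
    if any(
        attr['Key'] == '_source_uri' and
        '/client-138/' in attr['Value'].get('StringValue', '')
        for attr in item.get('DocumentAttributes', [])
    )
]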
As you mentioned earlier under "Filter Operation", you suggested using 'Contains' instead of 'ContainsAny', since we're trying to match a specific substring within the URI.
The "name" attribute is configured in the index with the following settings: Facetable: true, Searchable: true, Displayable: true, and Sortable: true. Here's the query I attempted:
Query parameters:
{
  "IndexId": "a9826c71-b19f-48af-8420-402613d9a91d",
  "PageSize": 100,
  "QueryText": "*",
  "AttributeFilter": {
    "Contains": {
      "Key": "name",
      "Value": { "StringValue": "john" }
    }
  }
}
However, this resulted in an error: Error querying Kendra (query): Parameter validation failed: Unknown parameter in AttributeFilter: "Contains", must be one of: AndAllFilters, OrAllFilters, NotFilter, EqualsTo, ContainsAll, ContainsAny, GreaterThan, GreaterThanOrEquals, LessThan, LessThanOrEquals
I've also checked the official documentation, and it confirms that "Contains" is not a valid filter operation in Kendra.
Could you please advise how I can achieve a partial string match (e.g., filter where "name" contains "john"), given that the attribute is a StringValue?
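One possible workaround, since the list of valid operators in the error above has no substring operator, is to request the attribute with the results and match it on the client side, along the lines of the filtering logic shared earlier in this thread. The sketch below is only an illustration: the helper names `attribute_contains` and `filter_items` and the sample data are hypothetical, and the case-insensitive comparison is an assumption about the desired behaviour.

```python
from typing import Any, Dict, List


def attribute_contains(item: Dict[str, Any], key: str, needle: str) -> bool:
    """True if the item's attribute `key` contains `needle`, ignoring case."""
    needle = needle.lower()
    for attr in item.get("DocumentAttributes", []):
        if attr.get("Key") != key:
            continue
        value = attr.get("Value", {})
        # Check both a single string and a list of strings.
        candidates = [value.get("StringValue", "")] + list(value.get("StringListValue", []))
        if any(needle in candidate.lower() for candidate in candidates):
            return True
    return False


def filter_items(items: List[Dict[str, Any]], key: str, needle: str) -> List[Dict[str, Any]]:
    """Keep the result items whose attribute `key` partially matches `needle`."""
    return [item for item in items if attribute_contains(item, key, needle)]


# Hypothetical usage against a live index (not executed here):
#   response = kendra.query(IndexId=INDEX_ID, QueryText=QUERY_TEXT, PageSize=100,
#                           RequestedDocumentAttributes=["name"])
#   johns = filter_items(response["ResultItems"], "name", "john")

# Stand-in data shaped like Query result items, for illustration only.
sample_items = [
    {"Id": "1", "DocumentAttributes": [{"Key": "name", "Value": {"StringValue": "John Smith"}}]},
    {"Id": "2", "DocumentAttributes": [{"Key": "name", "Value": {"StringValue": "Jane Doe"}}]},
    {"Id": "3", "DocumentAttributes": []},
]
johns = filter_items(sample_items, "name", "john")
```

The filter only sees the results that the query returned, so it narrows a page of results rather than searching the whole index.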
'Contains' is not a supported parameter