
"Failed to create the Amazon Opensearch Serverless collection. Failed to fetch" when creating a Bedrock knowledge base with web crawler


data source -> web crawler

Regex include pattern -> none

content chunking and parsing -> default

embeddings model -> Embed English v3

vector database -> quick create a new vector store

policies attached to user:

  1. AmazonBedrockFullAccess

  2. Custom policies:

A.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "aoss:CreateAccessPolicy",
        "aoss:CreateSecurityPolicy",
        "aoss:CreateCollection",
        "aoss:ListCollections",
        "aoss:BatchGetCollection",
        "aoss:UpdateCollection",
        "aoss:DeleteCollection",
        "aoss:ListAccessPolicies",
        "aoss:ListSecurityPolicies",
        "aoss:ListTagsForResource",
        "aoss:UpdateAccessPolicy",
        "aoss:GetSecurityPolicy",
        "aoss:UpdateSecurityPolicy",
        "iam:ListUsers",
        "iam:ListRoles"
      ],
      "Resource": "*"
    }
  ]
}

B.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:ListFoundationModels",
        "bedrock:ListCustomModels"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel"
      ],
      "Resource": [
        "arn:aws:bedrock:ap-southeast-1::foundation-model/amazon.titan-embed-text-v1",
        "arn:aws:bedrock:ap-southeast-1::foundation-model/cohere.embed-english-v3",
        "arn:aws:bedrock:ap-southeast-1::foundation-model/cohere.embed-multilingual-v3"
      ]
    }
  ]
}

C.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "MarketplaceBedrock",
      "Effect": "Allow",
      "Action": [
        "aws-marketplace:ViewSubscriptions",
        "aws-marketplace:Unsubscribe",
        "aws-marketplace:Subscribe"
      ],
      "Resource": "*"
    }
  ]
}
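
When checking policies like these by eye it is easy to miss an action, so here is a minimal Python sketch that lists which actions from a candidate list are not granted by a policy document. It uses policy A from above as input. The candidate list is only an assumption for illustration: I am not claiming the console's quick-create flow needs exactly these actions, and the check ignores Deny statements, Resource scoping, and the other attached policies.

```python
import json
from fnmatch import fnmatchcase

# Policy A from the question, copied as-is.
POLICY_A = """
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [
  "aoss:CreateAccessPolicy", "aoss:CreateSecurityPolicy", "aoss:CreateCollection",
  "aoss:ListCollections", "aoss:BatchGetCollection", "aoss:UpdateCollection",
  "aoss:DeleteCollection", "aoss:ListAccessPolicies", "aoss:ListSecurityPolicies",
  "aoss:ListTagsForResource", "aoss:UpdateAccessPolicy", "aoss:GetSecurityPolicy",
  "aoss:UpdateSecurityPolicy", "iam:ListUsers", "iam:ListRoles" ],
  "Resource": "*" } ] }
"""


def granted_actions(policy_json):
    """Collect every action named in the Allow statements of a policy document."""
    statements = json.loads(policy_json).get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    actions = set()
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        action = statement.get("Action", [])
        actions.update([action] if isinstance(action, str) else action)
    return actions


def missing_actions(policy_json, wanted):
    """Return the wanted actions that no Allow entry (wildcards included) covers."""
    granted = [g.lower() for g in granted_actions(policy_json)]
    return sorted(
        w for w in wanted
        if not any(fnmatchcase(w.lower(), g) for g in granted)
    )


# Hypothetical candidate list; adjust it to whatever the AWS docs say you need.
CANDIDATES = ["aoss:CreateCollection", "aoss:GetAccessPolicy", "aoss:APIAccessAll"]

if __name__ == "__main__":
    print(missing_actions(POLICY_A, CANDIDATES))
```

If the list comes back empty for the actions the documentation asks for, the problem is probably somewhere other than this policy.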

Please, I badly need help from an expert. I'm stuck on this one and don't know what to do anymore. What am I missing in the policies?

The console first shows: 'Preparing vector database in Amazon Opensearch Serverless. This process may take several minutes to complete.' and then: 'Failed to create the Amazon Opensearch Serverless collection. Failed to fetch'

But when I check the collections in the OpenSearch Service console, the collection is there; the only thing that fails is the creation of the knowledge base. I'm just following this video -> https://www.youtube.com/watch?v=oSnFZhHuIgg
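
To confirm from outside the console that the collection really exists and what state it is in, something like the following boto3 sketch could help. It only needs the `aoss:ListCollections` permission that policy A already grants. The region default is taken from the ARNs in policy B, and the function names are my own, not part of any AWS tooling.

```python
def list_collection_statuses(aoss_client):
    """Return {collection name: status} for every OpenSearch Serverless collection."""
    statuses = {}
    kwargs = {}
    while True:
        page = aoss_client.list_collections(**kwargs)
        for summary in page.get("collectionSummaries", []):
            statuses[summary["name"]] = summary["status"]
        token = page.get("nextToken")
        if not token:
            return statuses
        # Ask for the next page until the service stops returning a token.
        kwargs = {"nextToken": token}


def make_client(region="ap-southeast-1"):
    """Build the real client; boto3 is imported here so the helper above stays testable."""
    import boto3
    return boto3.client("opensearchserverless", region_name=region)


if __name__ == "__main__":
    for name, status in list_collection_statuses(make_client()).items():
        print(name, status)
```

A collection that shows as ACTIVE here while the knowledge base wizard still reports a failure would suggest the collection creation itself is fine and a later step is what breaks.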

1 Answer

Hi,

To simplify your debugging, I suggest starting by creating a Bedrock KB from S3 content as a first data source (a bucket with 2-3 manually created files will be enough). That will allow you to configure all the Bedrock KB parameters properly. Then you can add the web crawler as a second, direct data source.

Or you can even use your S3 bucket as the place where the crawler stores the crawled content, which is then loaded into the KB via a recurring sync job. Storing the content in S3 will make your observability much better: if the KB sync job complains about the content to be parsed and vectorized, it will be much simpler to analyze.
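
A rough boto3 sketch of that second option, assuming you run your own crawler and already have a KB with an S3 data source: each crawled page is written to the bucket under a key derived from its URL, and a sync is then triggered with `start_ingestion_job`. The key layout, the `crawled/` prefix, and the function names are assumptions for illustration only; the bucket name, knowledge base ID, and data source ID are yours to supply.

```python
from urllib.parse import urlparse


def s3_key_for_url(url, prefix="crawled/"):
    """Map a page URL to an S3 key, e.g. https://host/docs/a -> crawled/host/docs/a.html."""
    parsed = urlparse(url)
    path = parsed.path.strip("/") or "index"
    if not path.endswith(".html"):
        path += ".html"
    return f"{prefix}{parsed.netloc}/{path}"


def store_page(s3_client, bucket, url, html):
    """Write one crawled page to S3 so it can be inspected if a sync complains about it."""
    key = s3_key_for_url(url)
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=html.encode("utf-8"),
        ContentType="text/html",
    )
    return key


def start_sync(agent_client, knowledge_base_id, data_source_id):
    """Start a KB ingestion (sync) job for the S3 data source and return its job ID."""
    response = agent_client.start_ingestion_job(
        knowledgeBaseId=knowledge_base_id,
        dataSourceId=data_source_id,
    )
    return response["ingestionJob"]["ingestionJobId"]
```

The clients would come from `boto3.client("s3")` and `boto3.client("bedrock-agent")`. Running `start_sync` on a schedule after each crawl gives you the recurring sync, and every document the job processes is a plain object you can open in the bucket.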

A step-by-step description of KB creation with S3 is here: https://medium.com/@saikatm.courses/implementing-rag-app-using-knowledge-base-from-amazon-bedrock-and-streamlit-e52f8300f01d

Best,

Didier

Expert
answered 2 years ago
Expert
reviewed 2 years ago
