Does AWS use my training data to train its base model if I train a model derived from it?


Hi, my company wants to build an LLM on top of one of the base models available in SageMaker or Bedrock. Before we do, we want to be sure of one thing: if we train a model of our own, will our training data also be sent to the base model provider and used for their training? I understand that my model is stored in an S3 bucket and is isolated from anything external unless I allow access in the policy, but I would like the AWS documentation that states this explicitly. Could you please point me to it?

1 Answer

No, your data will not be shared with the base model provider. This is stated in the FAQs for both SageMaker JumpStart and Bedrock.

Security and Customisation section of Bedrock FAQs - https://aws.amazon.com/bedrock/faqs/

Q - Are user inputs and model outputs made available to third-party model providers?

No. Users' inputs and model outputs are not shared with any model providers.

Q - How does Amazon Bedrock ensure my data used in fine-tuning remains private and confidential?

When you’re fine tuning a model, your data is never exposed to the public internet, never leaves the AWS network, is securely transferred through your VPC, and is encrypted in transit and at rest. And, Bedrock enforces the same AWS access controls that you have with any of our other services.
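As a rough sketch of how the VPC and encryption settings mentioned above could be supplied when starting a Bedrock fine-tuning job: the helper below only assembles the request, and to my understanding the keys match the keyword arguments of boto3's create_model_customization_job. The helper name, every ARN, ID, bucket, and model name, and the hyperparameter value are placeholders I made up for illustration (hyperparameter names vary by base model), so check them against the current boto3 documentation before relying on this.

```python
def build_customization_request(job_name, model_name, role_arn, base_model_id,
                                training_s3_uri, output_s3_uri, hyper_parameters,
                                subnet_ids, security_group_ids, kms_key_arn):
    """Assemble keyword arguments for a Bedrock fine-tuning job (hypothetical helper)."""
    return {
        "jobName": job_name,
        "customModelName": model_name,
        "roleArn": role_arn,
        "baseModelIdentifier": base_model_id,
        # Training data is read from, and output written to, your own S3 buckets.
        "trainingDataConfig": {"s3Uri": training_s3_uri},
        "outputDataConfig": {"s3Uri": output_s3_uri},
        # Model-specific settings, passed as string values.
        "hyperParameters": dict(hyper_parameters),
        # Run the job through subnets and security groups in your own VPC.
        "vpcConfig": {
            "subnetIds": list(subnet_ids),
            "securityGroupIds": list(security_group_ids),
        },
        # Encrypt the resulting custom model with a KMS key you control.
        "customModelKmsKeyId": kms_key_arn,
    }


# All values below are placeholders, not real resources.
request = build_customization_request(
    job_name="example-finetune-job",
    model_name="example-custom-model",
    role_arn="arn:aws:iam::111122223333:role/ExampleBedrockRole",
    base_model_id="example-base-model-id",
    training_s3_uri="s3://example-bucket/train/data.jsonl",
    output_s3_uri="s3://example-bucket/output/",
    hyper_parameters={"epochCount": "1"},
    subnet_ids=["subnet-0123456789abcdef0"],
    security_group_ids=["sg-0123456789abcdef0"],
    kms_key_arn="arn:aws:kms:us-east-1:111122223333:key/example-key-id",
)

# To actually submit the job (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("bedrock").create_model_customization_job(**request)
```

Keeping the request as a plain dictionary makes it easy to review the VPC and KMS settings before anything is sent to AWS.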

Foundation models section of the SageMaker JumpStart FAQs - https://aws.amazon.com/sagemaker/faqs/

Q - Will my data be used or shared to update the base model that is offered to customers using SageMaker JumpStart?

No. Your inference and training data will not be used nor shared to update or train the base model that SageMaker JumpStart surfaces to customers.

answered 4 months ago
