How/where is data copied in SageMaker Pipelines?

This example is contrived for the question. As I understand it, when we create a SageMaker pipeline with steps to process data and then run training, the data is copied to the instance, into some opt/ml/.. directory I would assume. This is S3 File mode for training. Suppose I pull the data down manually from a notebook or terminal and copy it to wherever SageMaker wants it. When I run the pipeline, how can I tell SageMaker that the data is already present on the local instance, so that it doesn't have to download it from the S3 URI?

1 Answer

From the question, I understand that you have already copied the data to the local instance and would like to avoid copying it again on each pipeline run.

With the SageMaker SDK's local mode, you can specify a local path instead of an S3 URL as the input. The local files/dataset will then be used instead of being downloaded from S3.

This documentation shows how to enable local mode and specify its inputs - https://sagemaker.readthedocs.io/en/stable/overview.html#local-mode
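As a rough illustration of the local-mode approach described above, here is a minimal sketch. It assumes the `sagemaker` Python SDK and Docker are installed on the machine. The helper names `local_input_uri` and `train_locally`, the channel name `training`, and the `image_uri`, `role_arn` and `data_dir` arguments are all hypothetical placeholders, not something from the question. It shows a single training job rather than a full pipeline.

```python
from pathlib import Path


def local_input_uri(data_dir: str) -> str:
    """Turn a local directory into a file:// URI for use as a local-mode input."""
    return Path(data_dir).resolve().as_uri()


def train_locally(image_uri: str, role_arn: str, data_dir: str) -> None:
    """Run one training job in a local container, reading data from disk.

    Assumption: the sagemaker SDK and Docker are available. image_uri and
    role_arn are placeholders that the caller must supply.
    """
    # Imported lazily so the helper above works without the SDK installed.
    from sagemaker.estimator import Estimator

    estimator = Estimator(
        image_uri=image_uri,
        role=role_arn,
        instance_count=1,
        instance_type="local",  # local mode: run the container on this machine
    )
    # A file:// input is read from the local disk instead of downloaded from S3.
    # "training" is a hypothetical channel name.
    estimator.fit({"training": local_input_uri(data_dir)})
```

A call might look like `train_locally(my_image, my_role, "./data")`, where all three values are your own. Note that this only applies when the job itself runs in local mode. A job running on a managed training instance cannot see files that were copied to the notebook instance.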

AWS
answered a year ago
