Can SageMaker notebook jobs access studio storage?


I'm using SageMaker Studio, and I have my data files as well as a requirements.txt organized under my home directory. All works fine when I run notebook kernels interactively: they can access my files just fine. However, when I create a "notebook job", it doesn't seem to have access to any of my files. Is there a way to give my notebook job access to the same file system as my interactive notebooks?

After I run a job, I see that a folder for the job was created within the input S3 bucket, and within that folder there's an "input/" subfolder. But I don't know how to predict the name of the temp folder created for the job, so it doesn't seem like I could drop additional inputs in there myself, even if I wanted to. And if I could, how would I find them at run-time?

Could sure use guidance on how my notebook jobs can access input files.

Thanks,

Chris

  • I tried creating an explicit inputs folder in the S3 bucket, creating and populating various subfolders in it, and then specifying that URI as the input S3 URI. However, SageMaker still created a temp folder within that URI, with its own "input" subfolder, in which it put the notebook and initialization script. So it doesn't look like I can proactively stage inputs in S3, given that the input folder is always created dynamically, within a temp folder created for the job.

Asked a year ago · 805 views
1 answer
Accepted answer

Hi Chris, the way to use input files is to reference the S3 URIs directly in the notebook itself. That is, instead of reading from an inputs folder on your local EFS storage (which doesn't get copied over to the inputs folder for the training job), read the inputs directly from their S3 URIs. If the inputs will vary across notebook jobs, use parameterized executions (reference: https://docs.aws.amazon.com/sagemaker/latest/dg/notebook-auto-run-troubleshoot-override.html)
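A minimal sketch of what that looks like inside the notebook, assuming boto3 is available (it ships in the standard SageMaker images) and the job's execution role can read the bucket. The URI, bucket, and key names below are hypothetical placeholders; with parameterized executions, the variable in the tagged "parameters" cell is what gets overridden per job run.

```python
from urllib.parse import urlparse

# Parameters cell: tag this cell "parameters" so a SageMaker notebook job
# can override the value per run via parameterized executions.
input_s3_uri = "s3://my-bucket/data/train.csv"  # hypothetical default

def read_s3_object(uri: str) -> bytes:
    """Fetch a single object directly from S3, given its s3:// URI."""
    import boto3  # deferred import: the URI handling below needs only stdlib
    parsed = urlparse(uri)
    bucket, key = parsed.netloc, parsed.path.lstrip("/")
    return boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"].read()

# urlparse splits the URI into the bucket and key the helper will request:
parsed = urlparse(input_s3_uri)
bucket, key = parsed.netloc, parsed.path.lstrip("/")
```

In the job, `read_s3_object(input_s3_uri)` replaces what would otherwise be an `open()` against the Studio home directory.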

AWS
Durga_S
Answered a year ago
Expert
Reviewed 14 days ago
  • Thank you for the answer! So just to be clear: I'd pull all input files down from S3 at the start of the notebook, essentially using ephemeral storage that's specific to the job? And I presume that storage is truly job-specific and cleaned up at the end of the job?

  • Exactly! A notebook execution runs a training job, so the compute and storage are ephemeral.

  • Okay great, thanks again!
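The pull-everything-down approach from the comments above can be sketched as follows; a hypothetical helper, assuming boto3 is installed and the job role has s3:ListBucket and s3:GetObject on the bucket. It downloads every object under a prefix into a local directory on the job's ephemeral storage, preserving the subfolder structure.

```python
from pathlib import Path
from urllib.parse import urlparse

def parse_s3_uri(uri: str) -> tuple[str, str]:
    """Split an s3:// URI into (bucket, key_prefix)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3":
        raise ValueError(f"not an S3 URI: {uri}")
    return parsed.netloc, parsed.path.lstrip("/")

def download_prefix(s3_uri: str, local_dir: str) -> list[Path]:
    """Download every object under the prefix, keeping relative paths."""
    import boto3  # deferred import: parse_s3_uri works without boto3
    bucket, prefix = parse_s3_uri(s3_uri)
    s3 = boto3.client("s3")
    downloaded = []
    # Paginate so prefixes with more than 1000 objects are fully listed.
    for page in s3.get_paginator("list_objects_v2").paginate(
        Bucket=bucket, Prefix=prefix
    ):
        for obj in page.get("Contents", []):
            rel = obj["Key"][len(prefix):].lstrip("/")
            dest = Path(local_dir) / rel
            dest.parent.mkdir(parents=True, exist_ok=True)
            s3.download_file(bucket, obj["Key"], str(dest))
            downloaded.append(dest)
    return downloaded
```

Called at the top of the notebook, e.g. `download_prefix("s3://my-bucket/jobs/inputs/", "inputs")` (placeholder names), this recreates a local `inputs/` tree that the rest of the notebook can read as if it were the Studio home directory; the copy disappears with the job's ephemeral storage.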
