SageMaker Studio notebook instances restrict /dev/shm to 64 MB, preventing PyTorch multiprocess training


SageMaker Studio notebook instances restrict /dev/shm to 64 MB, which does not allow training PyTorch with the default multiprocess dataloaders. How can I add more capacity to /dev/shm, or which kernel can I use to train with PyTorch multiprocessing?

uname -a
Linux tensorflow-2-3-gpu--ml-g4dn-xlarge-33edf42bcb5531c041d8b56553ba 4.14.231-173.361.amzn2.x86_64 #1 SMP Mon Apr 26 20:57:08 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
df -h | grep -E 'shm|File'
Filesystem      Size  Used Avail Use% Mounted on
shm              64M     0   64M   0% /dev/shm
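
For illustration, here is a minimal sketch of how the limit bites (the dataset shape and sizes are assumptions, not from the question): with num_workers > 0, each worker process hands its batches back to the parent through shared memory under /dev/shm, so a 64 MB mount is exhausted after only a couple of in-flight batches and the workers typically die with an out-of-shared-memory "Bus error".

import torch
from torch.utils.data import DataLoader, TensorDataset

# ~38 MB per batch of float32 "images"; with 4 workers and the default
# prefetch_factor=2, several batches sit in /dev/shm at the same time.
dataset = TensorDataset(torch.randn(1_000, 3, 224, 224))
loader = DataLoader(dataset, batch_size=64, num_workers=4)

for (batch,) in loader:  # crashes once the 64 MB /dev/shm fills up
    pass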
Asked 2 years ago · 515 views
1 Answer

This is being tracked in the GitHub issue linked below.

A possible workaround is to use a regular Notebook Instance instead of a Studio notebook. On a regular Notebook Instance of the same size (ml.g4dn.xlarge), /dev/shm is 7.7G:

df -h | grep -E 'shm|File'
Filesystem      Size  Used Avail Use% Mounted on
tmpfs           7.7G     0  7.7G   0% /dev/shm
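
If switching instance types is not an option, a common in-notebook fallback (a sketch, not part of this answer) is to disable worker processes entirely: with num_workers=0 all loading happens in the main process, so batches never transit /dev/shm and the 64 MB limit is never hit, at the cost of no longer overlapping data loading with training.

from torch.utils.data import DataLoader

# Single-process loading: no worker processes, no shared-memory transfer.
loader = DataLoader(dataset, batch_size=64, num_workers=0)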
Support Engineer
Peter_X
Answered 2 years ago
