SageMaker Studio notebook instances restrict /dev/shm to 64 MB, preventing multiprocess PyTorch training


SageMaker Studio notebook instances restrict /dev/shm to 64 MB, which prevents training PyTorch models with multiprocess data loading (the default DataLoader workers). How can I add more capacity to /dev/shm, or what kernel can I use to train with PyTorch multiprocessing?

uname -a
Linux tensorflow-2-3-gpu--ml-g4dn-xlarge-33edf42bcb5531c041d8b56553ba 4.14.231-173.361.amzn2.x86_64 #1 SMP Mon Apr 26 20:57:08 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
df -h | grep -E 'shm|File'
Filesystem      Size  Used Avail Use% Mounted on
shm              64M     0   64M   0% /dev/shm
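For context, the same check can be done programmatically, so a training script can decide up front whether multiprocess data loading is safe. This is a hypothetical sketch (the 1 GB threshold is an arbitrary assumption, not a PyTorch requirement):

```python
# Sketch: check /dev/shm capacity before enabling DataLoader workers.
# PyTorch worker processes pass batches through /dev/shm by default,
# so a 64 MB limit is quickly exhausted.
import os

st = os.statvfs("/dev/shm")
shm_total_mb = st.f_frsize * st.f_blocks / (1024 * 1024)
print(f"/dev/shm size: {shm_total_mb:.0f} MB")

# Hypothetical policy: only enable num_workers > 0 with at least 1 GB.
use_workers = shm_total_mb >= 1024
print("multiprocess loading ok:", use_workers)
```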
asked 10 months ago · 175 views
1 Answer

This is being tracked in the GitHub issue linked below.

A possible workaround is to use a regular notebook instance instead of a Studio notebook instance. On a regular notebook instance of the same instance type (ml.g4dn.xlarge), /dev/shm is 7.7 GB:

df -h | grep -E 'shm|File'
Filesystem      Size  Used Avail Use% Mounted on
tmpfs           7.7G     0  7.7G   0% /dev/shm
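If switching instance types is not an option, shared-memory pressure can also be worked around on the PyTorch side. A minimal sketch, assuming a standard PyTorch install (the dataset shapes here are illustrative only):

```python
# Sketch: two ways to avoid exhausting a small /dev/shm with PyTorch.
import torch
import torch.multiprocessing as mp
from torch.utils.data import DataLoader, TensorDataset

# Option 1: route inter-process tensor sharing through the filesystem
# instead of /dev/shm (slower, but not capped at 64 MB).
mp.set_sharing_strategy("file_system")

# Illustrative dataset: 256 fake CIFAR-sized images with integer labels.
dataset = TensorDataset(
    torch.randn(256, 3, 32, 32),
    torch.randint(0, 10, (256,)),
)

# Option 2: num_workers=0 loads data in the main process and
# bypasses worker shared memory entirely.
loader = DataLoader(dataset, batch_size=32, num_workers=0)

batches = sum(1 for _ in loader)
print(batches)  # 256 samples / 32 per batch = 8 batches
```

The `file_system` strategy trades speed for capacity, and `num_workers=0` gives up loading parallelism, so neither fully replaces a larger /dev/shm; they are mitigations for the Studio limit described above.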
SUPPORT ENGINEER
Peter_X
answered 10 months ago

