SageMaker Studio notebook instances restrict /dev/shm to 64 MB, preventing PyTorch multiprocess training


SageMaker Studio notebook instances restrict /dev/shm to 64 MB, which prevents training PyTorch with the default multiprocess dataloaders. How can I add more capacity to /dev/shm, or which kernel can I use to train with PyTorch multiprocessing?

uname -a
Linux tensorflow-2-3-gpu--ml-g4dn-xlarge-33edf42bcb5531c041d8b56553ba 4.14.231-173.361.amzn2.x86_64 #1 SMP Mon Apr 26 20:57:08 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
df -h | grep -E 'shm|File'
Filesystem      Size  Used Avail Use% Mounted on
shm              64M     0   64M   0% /dev/shm
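
To show how this bites, here is a minimal repro sketch (the dataset is synthetic and the sizes are illustrative assumptions): DataLoader worker processes hand finished batches to the main process through tensors allocated in /dev/shm, so with only 64 MB available, iteration typically aborts with "RuntimeError: DataLoader worker (pid N) is killed by signal: Bus error".

import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in dataset: roughly 600 MB of float32 "images"
# (the shapes and counts here are assumptions, not my real workload).
dataset = TensorDataset(torch.randn(1_000, 3, 224, 224))

# num_workers > 0 is what routes each batch (about 38 MB at this size)
# through /dev/shm between the worker and the main process.
loader = DataLoader(dataset, batch_size=64, num_workers=4)

for (batch,) in loader:  # the Bus error surfaces during iteration
    pass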
Asked 2 years ago · Viewed 525 times
1 Answer

This is being tracked in the GitHub issue linked below.

A possible workaround is to use a regular Notebook Instance instead of a Studio Notebook Instance. On a regular Notebook Instance of the same size (ml.g4dn.xlarge), /dev/shm is 7.7 GB:

df -h | grep -E 'shm|File'
Filesystem      Size  Used Avail Use% Mounted on
tmpfs           7.7G     0  7.7G   0% /dev/shm
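
If staying on Studio is a requirement, a hedged sketch of code-side mitigations follows (the dataset below is an illustrative stand-in, not taken from the question): worker processes pass batches to the parent through /dev/shm, so you can either disable workers or keep the in-flight batches well under the 64 MB cap.

import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(256, 3, 224, 224))  # illustrative stand-in

# Option 1: num_workers=0 builds batches in the main process, so nothing
# crosses /dev/shm. Slower, but immune to the 64 MB limit.
loader = DataLoader(dataset, batch_size=64, num_workers=0)

# Option 2: keep workers but shrink the shared-memory footprint with
# fewer workers, smaller batches, and minimal prefetching
# (roughly 10 MB in flight at these settings).
loader = DataLoader(dataset, batch_size=8, num_workers=2, prefetch_factor=1)

As for enlarging /dev/shm itself: a remount such as "mount -o remount,size=2G /dev/shm" requires root, which Studio's managed kernel containers do not generally expose, so the Notebook Instance route above is usually the practical fix.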
Support Engineer
Peter_X
Answered 2 years ago
