OOM when generating embeddings in JupyterLab

The notebook instance is an ml.m5d.2xlarge with 32GB of memory. However, we are hitting OOM errors in SageMaker Notebooks when generating embeddings with TensorFlow:

tf_hub_embedding_layer = hub.KerasLayer("https://tfhub.dev/google/universal-sentence-encoder/4",
                                        trainable=False,
                                        name="universal_sentence_encoder")
embeddings = tf_hub_embedding_layer(lr_cleaned_df.text.values)

After loading the universal sentence encoder, the virtual memory size of the process is about 7GB. lr_cleaned_df.text.values contains 60K text snippets and takes up about 122MB in memory.

Is there a default allocation of memory to a Jupyter notebook? If so, can it be overridden?

Asked 2 years ago, 396 views
1 answer

Hi there,

I was able to reproduce this behavior on a ml.m5d.2xlarge notebook instance using similar code.

tf_hub_embedding_layer = hub.KerasLayer("https://tfhub.dev/google/universal-sentence-encoder/4",
                                        trainable=False,
                                        name="universal_sentence_encoder")
embeddings = tf_hub_embedding_layer(train_examples)

In my case, the code ran successfully with 25K lines of text. However, when I ran it with 50K lines of text (train_examples.repeat(2)), I also hit OOM errors. Running free -h in a terminal showed that the notebook instance did in fact run out of free memory while the code above was running, which explains the OOM errors.

              total        used        free      shared  buff/cache   available
Mem:            30G         22G        900M        676K        7.2G        7.6G
Swap:            0B          0B          0B
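
The same check can be made from inside the notebook. The following is a minimal sketch, assuming a Linux host where /proc/meminfo is available; parse_meminfo and available_gib are hypothetical helper names, not part of any SageMaker or Jupyter API.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Most entries are reported in kB; a few (e.g. HugePages_Total) are counts.
    """
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        # Keep only well-formed "Key:   <number> [unit]" lines.
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def available_gib(path="/proc/meminfo"):
    """Return the kernel's MemAvailable estimate in GiB (Linux only)."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    # MemAvailable is reported in kB, so divide by 2**20 to get GiB.
    return info["MemAvailable"] / 2**20


# Usage in a notebook cell, before and after the embedding call:
#   print(f"{available_gib():.1f} GiB available")
```

MemAvailable corresponds to the "available" column of free, which is usually a better indicator than "free" because it accounts for reclaimable cache.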

To run code like this, please consider choosing a larger instance type with more memory.
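
If a larger instance is not an option, embedding the texts in batches may be worth trying, since it avoids pushing all rows through the model in a single call. This is a sketch under assumptions and was not tested on the setup above: embed_in_batches is a hypothetical helper, the batch size of 1024 is an arbitrary starting point, and embed_fn stands for any callable that maps a batch of strings to a 2-D array or tensor (such as the tf_hub_embedding_layer defined earlier).

```python
import numpy as np


def embed_in_batches(embed_fn, texts, batch_size=1024):
    """Embed `texts` batch by batch and return one (len(texts), dim) array.

    Only one batch of intermediate activations is alive at a time, which
    may lower peak memory compared with a single call on all texts.
    """
    chunks = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        # np.asarray also converts a TensorFlow eager tensor to NumPy.
        chunks.append(np.asarray(embed_fn(batch)))
    if not chunks:
        # No input texts: return an empty 2-D array instead of failing.
        return np.empty((0, 0))
    return np.concatenate(chunks, axis=0)


# Usage with the layer from the question (assumed, not verified here):
#   embeddings = embed_in_batches(tf_hub_embedding_layer,
#                                 lr_cleaned_df.text.values)
```

The final array still has to fit in memory, but for 512-dimensional float32 embeddings that is small compared with the memory used during inference.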

Support Engineer
Peter_X
answered 2 years ago
  • Hi @Peter_X, I ended up running the experiment on an ml.m5d.4xlarge instance and it succeeded. That said, this does not answer the question of whether the memory allocated to a Jupyter notebook (or kernel) can be configured.
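
On the open question about a configurable allocation: as far as I know, Jupyter does not reserve or cap memory per kernel by default, so a kernel can use whatever the instance or an OS-level limit allows. One way to check whether an OS-level limit applies to the current kernel process is Python's standard resource module. The sketch below assumes a Unix-like host; describe_limit and address_space_limits are illustrative names.

```python
import resource


def describe_limit(value):
    """Format an rlimit value given in bytes as GiB, or 'unlimited'."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / 2**30:.1f} GiB"


def address_space_limits():
    """Return the (soft, hard) address-space limits of this process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    return describe_limit(soft), describe_limit(hard)


# Run inside a notebook cell to inspect the kernel process itself.
print(address_space_limits())
```

If both values are reported as unlimited, no per-process address-space cap is set for the kernel, and the practical ceiling is the instance's physical memory.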
