OOM when generating embedding in Jupyter Lab

The notebook instance is ml.m5d.2xlarge with 32GB of memory. However, we are encountering OOM errors in SageMaker Notebooks when generating embeddings with TensorFlow:

import tensorflow_hub as hub

tf_hub_embedding_layer = hub.KerasLayer("https://tfhub.dev/google/universal-sentence-encoder/4",
                                        trainable=False,
                                        name="universal_sentence_encoder")
embeddings = tf_hub_embedding_layer(lr_cleaned_df.text.values)

After loading the Universal Sentence Encoder, the process's virtual memory size is about 7GB. lr_cleaned_df.text.values contains 60K text snippets and occupies about 122MB in memory.
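For reference, a quick way to check how much memory a pandas text column actually occupies is Series.memory_usage; the deep=True flag is needed for object/string columns, since otherwise only the 8-byte object pointers are counted. A small sketch with a hypothetical frame standing in for lr_cleaned_df:

```python
import pandas as pd

# Hypothetical small frame standing in for lr_cleaned_df
df = pd.DataFrame({"text": ["first snippet", "second snippet", "third snippet"]})

# deep=True walks the underlying Python string objects; without it,
# only the per-element pointers are counted for object-dtype columns
shallow = df["text"].memory_usage(deep=False)
deep = df["text"].memory_usage(deep=True)
print(shallow, deep)  # the deep count is substantially larger
```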

Is there a default memory allocation for a Jupyter notebook? If so, can it be overridden?

asked 2 years ago · 377 views
1 Answer

Hi there,

I was able to reproduce this behavior on a ml.m5d.2xlarge notebook instance using similar code.

import tensorflow_hub as hub

tf_hub_embedding_layer = hub.KerasLayer("https://tfhub.dev/google/universal-sentence-encoder/4",
                                        trainable=False,
                                        name="universal_sentence_encoder")
embeddings = tf_hub_embedding_layer(train_examples)

In my case, it ran successfully with 25K lines of text. However, when I ran it with 50K lines of text (train_examples.repeat(2)), I also experienced OOM errors. Running free -h in a terminal confirmed that the notebook instance did in fact run out of free memory while running the code above, hence the OOM errors.

              total        used        free      shared  buff/cache   available
Mem:            30G         22G        900M        676K        7.2G        7.6G
Swap:            0B          0B          0B
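On the question of a default memory allocation: as far as I know, Jupyter itself does not impose a per-notebook memory cap. The kernel is an ordinary Linux process, bounded only by the instance's RAM and any ulimit/cgroup settings on the host. A quick stdlib-only check of the process's address-space limit:

```python
import resource

# RLIMIT_AS is the per-process virtual address-space cap.
# RLIM_INFINITY means no limit is configured, so the kernel process
# can grow until the machine itself runs out of memory.
soft, hard = resource.getrlimit(resource.RLIMIT_AS)
print(soft == resource.RLIM_INFINITY)  # typically True on a default setup
```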

To run code like this on the full dataset, please consider choosing a larger instance size with more memory.
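Alternatively, the embedding call can be broken into batches so that only one chunk of activations is resident at a time. A minimal sketch, assuming the same tf_hub_embedding_layer as in the question (the batching helper itself is generic and the batch_size value is an assumption to tune):

```python
import numpy as np

def embed_in_batches(embed_fn, texts, batch_size=1000):
    """Apply embed_fn to texts in fixed-size batches and stack the results,
    so peak memory is bounded by one batch of embeddings at a time."""
    chunks = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        # np.asarray converts a TF tensor (or list) to a NumPy array
        chunks.append(np.asarray(embed_fn(batch)))
    return np.concatenate(chunks, axis=0)

# Usage with the TF Hub layer from the question (not run here):
# embeddings = embed_in_batches(tf_hub_embedding_layer, lr_cleaned_df.text.values)
```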

SUPPORT ENGINEER
Peter_X
answered 2 years ago
  • Hi @Peter_X, I ended up running the experiment on a ml.m5d.4xlarge instance and was successful. Having said that, it does not answer the question of whether the memory allocated to a Jupyter notebook (or kernel) can be configured.
