
Why does my kernel keep restarting when I try to download pre-trained Hugging Face BERT model weights to Amazon SageMaker?


When I try to download pre-trained Hugging Face BERT model weights in the conda_pytorch_p36 kernel of my Amazon SageMaker notebook instance with the following command, the kernel always restarts:

from transformers import BartForConditionalGeneration
model2 = BartForConditionalGeneration.from_pretrained(PRE_TRAINED_MODEL_NAME2, cache_dir='hf_cache_dir/')

Note: I installed the following library with pip:

!pip install transformers==4.17.0

The result is the same for the Hugging Face "facebook/bart-large-cnn" model.

Why is this happening, and how do I resolve the issue?

  • Is it possible that you are running out of memory?

1 Answer

This typically happens when resource utilization on the notebook instance is high, so switching to a larger instance type may help. If the problem persists, open a support case in the SageMaker queue and include the SageMaker notebook instance ARN and the associated CloudWatch logs for this notebook, so that a support engineer can troubleshoot further.
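
Before changing the instance type, you can roughly check whether memory is the bottleneck. The sketch below is only an illustration, not an official diagnostic: the helper names are hypothetical, the parameter count of about 400 million is an approximate figure assumed for a BART-large-sized model, and the 2x safety factor is an arbitrary allowance for temporary copies made while loading. It uses only the Python standard library and returns None for available memory on platforms that do not expose it.

```python
import os


def available_memory_bytes():
    """Return available physical memory in bytes, or None if the platform does not expose it."""
    try:
        # Works on Linux (the OS of SageMaker notebook instances).
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_AVPHYS_PAGES")
    except (ValueError, OSError, AttributeError):
        return None


def estimated_weights_bytes(num_parameters, bytes_per_parameter=4):
    """Estimate the size of the weights alone (float32 uses 4 bytes per parameter)."""
    return num_parameters * bytes_per_parameter


def has_headroom(available, required, safety_factor=2.0):
    """Return True only if available memory is known and covers required * safety_factor."""
    return available is not None and available >= required * safety_factor


if __name__ == "__main__":
    # ASSUMPTION: roughly 400 million parameters; adjust for the model you load.
    required = estimated_weights_bytes(400_000_000)
    available = available_memory_bytes()
    print("Estimated weight size (bytes):", required)
    print("Available memory (bytes):", available)
    print("Enough headroom:", has_headroom(available, required))
```

If the last line prints False just before the call to from_pretrained, that is consistent with the kernel restarting because of memory pressure, although it does not prove it.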

answered 2 months ago
