Why does my kernel keep restarting when I try to download pre-trained Hugging Face BART model weights to Amazon SageMaker?


When I try to download pre-trained Hugging Face BART model weights in the conda_pytorch_p36 kernel of my Amazon SageMaker notebook instance using the following code, the kernel always restarts:

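from transformers import BartForConditionalGeneration

PRE_TRAINED_MODEL_NAME2 = 'sshleifer/distilbart-cnn-12-6'
# Download the weights, or reuse them if they are already in the local cache directory
model2 = BartForConditionalGeneration.from_pretrained(PRE_TRAINED_MODEL_NAME2, cache_dir='hf_cache_dir/')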

Note that I installed the following library using pip:

!pip install transformers==4.17.0

The result is the same for the Hugging Face "facebook/bart-large-cnn" model.

Why is this happening, and how do I resolve the issue?

  • Is it possible that you are running out of memory?

asked 2 years ago · 564 views
1 Answer

This typically happens when resource utilization on the notebook instance is high (for example, when memory runs out), and moving to a larger instance type may help. Additionally, I suggest that you open a support case in the SageMaker queue and provide the SageMaker notebook ARN and the associated CloudWatch logs recorded for this notebook, so that a support engineer can troubleshoot the issue further.
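Before changing the instance type, it may be worth checking how much memory and disk space the notebook instance actually has free. The following is a minimal sketch using only the Python standard library, assuming a Linux host where /proc/meminfo is readable. The helper names (parse_meminfo, has_headroom, report_resources) are hypothetical, and the 2 GiB requirement and the 2x safety factor are placeholder assumptions, not measured values for these models.

```python
import shutil
from pathlib import Path


def parse_meminfo(text):
    """Parse the contents of /proc/meminfo into a dict of integer values.

    Entries reported in kB are converted to bytes; entries without a
    unit (plain counters) are kept as-is.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if not fields:
            continue
        value = int(fields[0])
        if len(fields) > 1 and fields[1] == "kB":
            value *= 1024
        info[key.strip()] = value
    return info


def has_headroom(available_bytes, required_bytes, safety_factor=2.0):
    """Return True if the available amount covers the requirement with a margin.

    The default safety factor of 2.0 is an arbitrary assumption.
    """
    return available_bytes >= required_bytes * safety_factor


def report_resources(cache_dir=".", required_bytes=2 * 1024**3):
    """Print free memory and free disk space for an existing directory.

    required_bytes is a placeholder (2 GiB); replace it with the actual
    size of the model files you intend to download.
    """
    meminfo_path = Path("/proc/meminfo")  # present on Linux only
    if meminfo_path.exists():
        mem = parse_meminfo(meminfo_path.read_text())
        available = mem.get("MemAvailable", mem.get("MemFree", 0))
        print(f"Available memory: {available / 1024**3:.2f} GiB, "
              f"enough: {has_headroom(available, required_bytes)}")
    disk = shutil.disk_usage(cache_dir)  # cache_dir must already exist
    print(f"Free disk at {cache_dir!r}: {disk.free / 1024**3:.2f} GiB, "
          f"enough: {has_headroom(disk.free, required_bytes)}")


if __name__ == "__main__":
    report_resources()
```

Running this in a notebook cell just before the from_pretrained call gives a rough indication of whether memory or disk is the constrained resource. If either value is low, that supports trying a larger instance type or a larger storage volume, and the numbers are useful to include in a support case.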

AWS
answered 2 years ago
