ModuleNotFoundError when starting a training job on SageMaker


I want to submit a training job on SageMaker. The script works when I run it in a notebook, but when I submit it as a training job with the code below, I get ModuleNotFoundError: No module named 'nltk'.

My code is

import sagemaker
from sagemaker.pytorch import PyTorch

JOB_PREFIX = 'pyt-ic'
FRAMEWORK_VERSION = '1.3.1'

estimator = PyTorch(entry_point='finetune-T5.py',
                    source_dir='../src',
                    train_instance_type='ml.p2.xlarge',
                    train_instance_count=1,
                    role=sagemaker.get_execution_role(),
                    framework_version=FRAMEWORK_VERSION,
                    debugger_hook_config=False,
                    py_version='py3',
                    base_job_name=JOB_PREFIX)

estimator.fit()

finetune-T5.py imports many other libraries that are not installed in the training container. How can I install the missing libraries? Or is there a better way to run the training job?

asked 2 years ago, 370 views
1 Answer
Accepted Answer

Check out this link (see the "Using third-party libraries" section) on how to install third-party libraries for training jobs. You need to create a requirements.txt file in the same directory as your training script (the source_dir you pass to the estimator) so that the other dependencies are installed at runtime.
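
As a rough illustration of that step, the sketch below writes a requirements.txt with one package per line into a source directory. The helper name write_requirements is hypothetical, and the package list contains only nltk because that is the one module named in the error; any other libraries finetune-T5.py imports would need to be added. The demo writes to a temporary directory rather than the real '../src'.

```python
import tempfile
from pathlib import Path


def write_requirements(source_dir, packages):
    """Write one package per line to <source_dir>/requirements.txt.

    Hypothetical helper for illustration; editing the file by hand works too.
    """
    src = Path(source_dir)
    src.mkdir(parents=True, exist_ok=True)
    req_file = src / "requirements.txt"
    req_file.write_text("\n".join(packages) + "\n")
    return req_file


# Demo in a throwaway directory; for the question above the target
# would be the estimator's source_dir ('../src').
demo_dir = tempfile.mkdtemp()
req_file = write_requirements(demo_dir, ["nltk"])
print(req_file.read_text())
```

With the file sitting next to finetune-T5.py in source_dir, the estimator code in the question should not need to change, assuming the container for the chosen framework version supports requirements.txt as the linked section describes.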

Sam
answered 2 years ago
