ModuleNotFoundError when starting a training job on SageMaker


I want to submit a training job on SageMaker. I tried the code in a notebook and it works. When I run the following, I get ModuleNotFoundError: No module named 'nltk'.

My code is

import sagemaker
from sagemaker.pytorch import PyTorch

JOB_PREFIX = 'pyt-ic'
FRAMEWORK_VERSION = '1.3.1'

estimator = PyTorch(entry_point='finetune-T5.py',
                    source_dir='../src',
                    train_instance_type='ml.p2.xlarge',
                    train_instance_count=1,
                    role=sagemaker.get_execution_role(),
                    framework_version=FRAMEWORK_VERSION,
                    debugger_hook_config=False,
                    py_version='py3',
                    base_job_name=JOB_PREFIX)

estimator.fit()

finetune-T5.py imports many other libraries that are not installed. How can I install the missing libraries? Or is there a better way to run the training job?

  • I tried adding nltk to the requirements.txt file in the scripts directory. That worked for another module but not for nltk; what could I be doing wrong?

Asked 4 years ago, 1572 views
1 Answer
Accepted Answer

Check out this link (the "Using third-party libraries" section) for how to install third-party libraries for training jobs. You need to create a requirements.txt file in the same directory as your training script so that the other dependencies are installed at runtime.
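
One possible reason the nltk entry was ignored (an assumption; the thread does not confirm it) is that requirements.txt sat in a scripts directory rather than in the source_dir passed to the estimator, which is '../src' in the question. The sketch below is a local pre-flight check you could run before calling estimator.fit(). The helper name read_requirements is hypothetical and is not part of the SageMaker SDK; it only confirms that the entry point and requirements.txt sit side by side and returns the packages listed.

```python
from pathlib import Path


def read_requirements(source_dir, entry_point):
    """Return the package lines listed in source_dir/requirements.txt.

    Raises FileNotFoundError if the entry point or requirements.txt
    is not located directly inside source_dir.
    """
    src = Path(source_dir)
    for name in (entry_point, "requirements.txt"):
        if not (src / name).is_file():
            raise FileNotFoundError(f"{name} not found in {src}")

    lines = (src / "requirements.txt").read_text().splitlines()
    # Keep only real entries: skip blank lines and comment lines.
    return [ln.strip() for ln in lines
            if ln.strip() and not ln.strip().startswith("#")]


# Example call with the paths from the question:
#   read_requirements("../src", "finetune-T5.py")
```

If 'nltk' is missing from the returned list, or the call raises FileNotFoundError, the file being edited is probably not the one uploaded with the training script.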

AWS
Sam
Answered 4 years ago
