ModuleNotFoundError when starting a training job on SageMaker

I want to submit a training job on SageMaker. The code works when I run it in a notebook, but when I submit it as a training job with the code below, I get ModuleNotFoundError: No module named 'nltk'.

My code is:

import sagemaker
from sagemaker.pytorch import PyTorch

JOB_PREFIX = 'pyt-ic'
FRAMEWORK_VERSION = '1.3.1'

estimator = PyTorch(entry_point='finetune-T5.py',
                    source_dir='../src',
                    train_instance_type='ml.p2.xlarge',
                    train_instance_count=1,
                    role=sagemaker.get_execution_role(),
                    framework_version=FRAMEWORK_VERSION,
                    debugger_hook_config=False,
                    py_version='py3',
                    base_job_name=JOB_PREFIX)

estimator.fit()

finetune-T5.py imports many other libraries that are not installed in the training environment. How can I install the missing libraries? Or is there a better way to run the training job?

  • I tried adding nltk to the requirements.txt file in the scripts directory. That worked for another module but not for nltk; what could I be doing wrong?

Asked 4 years ago, 1572 views
1 Answer
Accepted Answer

Check out this link (the "Using third-party libraries" section) for how to install third-party libraries for training jobs. You need to create a requirements.txt file in the same directory as your training script (in your case the source_dir, ../src) so that the other dependencies are installed at runtime.
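
Since the question mentions a requirements.txt in a "scripts" directory while the estimator uses source_dir='../src', it may be worth checking locally that the file sits in the directory that actually gets uploaded. The following is only a rough sketch of such a check: missing_requirements is a hypothetical helper (not part of the SageMaker SDK), and it assumes requirements.txt is expected at the top level of source_dir.

```python
import re
from pathlib import Path


def missing_requirements(source_dir, needed):
    """Return the packages in `needed` that source_dir/requirements.txt does not list.

    Hypothetical helper for a local sanity check; not a SageMaker API.
    """
    req_file = Path(source_dir) / "requirements.txt"
    if not req_file.is_file():
        # No requirements.txt at all: every package is unaccounted for.
        return sorted(needed)

    listed = set()
    for line in req_file.read_text().splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        # Keep only the package name: cut at version specifiers, extras, markers.
        name = re.split(r"[<>=!~;\[\s]", line, maxsplit=1)[0]
        listed.add(name.lower())

    return sorted(p for p in needed if p.lower() not in listed)


if __name__ == "__main__":
    # '../src' is the source_dir from the question; 'nltk' is the failing import.
    missing = missing_requirements("../src", ["nltk"])
    if missing:
        print("Not listed in ../src/requirements.txt:", missing)
```

If this reports nltk as missing, the requirements.txt being edited is probably not the one inside source_dir, which would be one possible explanation for why it is not picked up.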

AWS
Sam
Answered 4 years ago
