Pipeline works until the RepackModel step: AlgorithmError: framework error: Traceback (most recent call last): File "/miniconda3/lib/python3.7/site-packages/sagemaker_containers/_trainer.py", line 84

ClientError: AlgorithmError: framework error:
Traceback (most recent call last):
  File "/miniconda3/lib/python3.7/site-packages/sagemaker_containers/_trainer.py", line 84, in train
    entrypoint()
  File "/miniconda3/lib/python3.7/site-packages/sagemaker_sklearn_container/training.py", line 39, in main
    train(environment.Environment())
  File "/miniconda3/lib/python3.7/site-packages/sagemaker_sklearn_container/training.py", line 35, in train
    runner_type=runner.ProcessRunnerType)
  File "/miniconda3/lib/python3.7/site-packages/sagemaker_training/entry_point.py", line 100, in run
    wait, capture_error
  File "/miniconda3/lib/python3.7/site-packages/sagemaker_training/process.py", line 291, in run
    cwd=environment.code_dir,
  File "/miniconda3/lib/python3.7/site-packages/sagemaker_training/process.py", line 208, in check_error
    info=extra_info,
sagemaker_training.errors.ExecuteUserScriptError:
ExecuteUserScriptError:
ExitCode 1
ErrorMessage ""
Command "/bin/sh -c ./_repack_script_launcher.sh --dependencies

JasonP
asked 6 months ago · 250 views
2 Answers

🤔 You may need to investigate the code and configuration of the RepackModel step in your pipeline, as well as the dependencies of the user script that it executes. Also check the input data and model parameters to make sure they are correct and compatible with the algorithm and framework being used.
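Since the traceback comes from sagemaker_training, the RepackModel step appears to run as an ordinary SageMaker training job, so its configuration can be read back with DescribeTrainingJob. The following is a minimal sketch, not an official debugging tool: the helper name summarize_training_job is made up for illustration, and the job name in the usage comment is a placeholder you would replace with the failed job's name from the pipeline execution.

```python
from typing import Any, Dict


def summarize_training_job(description: Dict[str, Any]) -> Dict[str, Any]:
    """Pick out the DescribeTrainingJob fields that matter when debugging a failed step.

    `description` is the dict returned by the SageMaker DescribeTrainingJob API.
    Missing fields are tolerated and come back as None or empty containers.
    """
    # Map each input channel name to the S3 URI it reads from.
    channels = {
        channel.get("ChannelName"): channel.get("DataSource", {})
        .get("S3DataSource", {})
        .get("S3Uri")
        for channel in description.get("InputDataConfig", [])
    }
    return {
        "status": description.get("TrainingJobStatus"),
        "failure_reason": description.get("FailureReason"),
        "hyperparameters": description.get("HyperParameters", {}),
        "input_channels": channels,
        "output_path": description.get("OutputDataConfig", {}).get("S3OutputPath"),
    }


# Usage (needs boto3 and AWS credentials; the job name is a placeholder):
#   import boto3
#   desc = boto3.client("sagemaker").describe_training_job(
#       TrainingJobName="<failed-repack-job-name>"
#   )
#   print(summarize_training_job(desc))
```

The hyperparameters and input channel URIs in the summary show which script, dependencies, and model location the step was actually given, which is usually quicker than reading the full traceback.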

EXPERT
answered 6 months ago
  • Where can I find the script? The register-RepackModel step is automatically created in the pipeline by SageMaker after check-model-accuracy?

  • From the error logs you've provided, the system is unable to find the _repack_script_launcher.sh script, which should be a part of the SageMaker training job. There's also an error indicating that the file model.tar.gz cannot be found in the expected directory, and there's a failure to parse a hyperparameter.

  • The model.tar.gz file is not found in the directory /opt/ml/input/data/training/. Check that the file is generated correctly and placed in the correct location.

  • Which OS are you using for SageMaker? Make sure you are using a supported OS.
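One way to act on the model.tar.gz comment above is to download the archive that the upstream training step produced and confirm that it opens and contains the expected files. This is a rough sketch under assumptions: the helper name list_model_archive and the local path in the usage comment are hypothetical, and the archive is assumed to have been copied from S3 to local disk first.

```python
import tarfile
from pathlib import Path
from typing import List


def list_model_archive(archive_path: str) -> List[str]:
    """Return the sorted member names of a model archive, or raise if it is unusable."""
    path = Path(archive_path)
    if not path.is_file():
        raise FileNotFoundError(f"model archive not found: {path}")
    if not tarfile.is_tarfile(str(path)):
        raise ValueError(f"not a readable tar archive: {path}")
    # "r:*" lets tarfile detect the compression (gzip for model.tar.gz).
    with tarfile.open(str(path), mode="r:*") as archive:
        return sorted(archive.getnames())


# Usage (the local path is a placeholder for wherever you downloaded the file):
#   print(list_model_archive("./model.tar.gz"))
```

If this raises, or the listing is empty, the problem lies with the step that writes the model artifact rather than with the repack step itself.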

Here's the additional error I found in the SageMaker Studio logs:

[Four screenshots of the log output were attached here; the images are not available.]

JasonP
answered 6 months ago
