AWS SageMaker - Extending Pre-built Container, Deploy Endpoint Failed. No such file or directory: 'serve'"

0

I am trying to deploy the SageMaker Inference Endpoint by extending the Pre-built image. However, it failed with "FileNotFoundError: [Errno 2] No such file or directory: 'serve'"

My Dockerfile

ARG REGION=us-west-2

# SageMaker PyTorch image
FROM 763104351884.dkr.ecr.$REGION.amazonaws.com/pytorch-inference:1.12.1-gpu-py38-cu116-ubuntu20.04-ec2

RUN apt-get update

ENV PATH="/opt/ml/code:${PATH}"

# this environment variable is used by the SageMaker PyTorch container to determine our user code directory.
ENV SAGEMAKER_SUBMIT_DIRECTORY /opt/ml/code

# /opt/ml and all subdirectories are utilized by SageMaker, use the /code subdirectory to store your user code.
COPY inference.py /opt/ml/code/inference.py

# Defines inference.py as script entrypoint 
ENV SAGEMAKER_PROGRAM inference.py

CloudWatch Log From /aws/sagemaker/Endpoints/mytestEndpoint

2022-09-30T04:47:09.178-07:00
Traceback (most recent call last):
  File "/usr/local/bin/dockerd-entrypoint.py", line 20, in <module>
    subprocess.check_call(shlex.split(' '.join(sys.argv[1:])))
  File "/opt/conda/lib/python3.8/subprocess.py", line 359, in check_call
    retcode = call(*popenargs, **kwargs)
  File "/opt/conda/lib/python3.8/subprocess.py", line 340, in call
    with Popen(*popenargs, **kwargs) as p:
  File "/opt/conda/lib/python3.8/subprocess.py", line 858, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/opt/conda/lib/python3.8/subprocess.py", line 1704, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
Traceback (most recent call last): File "/usr/local/bin/dockerd-entrypoint.py", line 20, in <module> subprocess.check_call(shlex.split(' '.join(sys.argv[1:]))) File "/opt/conda/lib/python3.8/subprocess.py", line 359, in check_call retcode = call(*popenargs, **kwargs) File "/opt/conda/lib/python3.8/subprocess.py", line 340, in call with Popen(*popenargs, **kwargs) as p: File "/opt/conda/lib/python3.8/subprocess.py", line 858, in __init__ self._execute_child(args, executable, preexec_fn, close_fds, File "/opt/conda/lib/python3.8/subprocess.py", line 1704, in _execute_child raise child_exception_type(errno_num, err_msg, err_filename)

2022-09-30T04:47:13.409-07:00
FileNotFoundError: [Errno 2] No such file or directory: 'serve'
profile picture
質問済み 2年前677ビュー
2回答
1

Hi, @holopekochan!

The serve script is installed by SageMaker PyTorch Inference Toolkit when you pip-install it in the Dockerfile.

However, it's hard to say why it's not found in your container. Are you sure that you use the inference container, not training container, for your endpoint? If you go to the AWS Console > Amazon SageMaker > Models > your model, what ECR image it shows in Container 1 - Image?

It will be useful if you can share the code that you used to setup the SageMaker PyTorch estimator (if any) how you define your PyTorchModel and how you deploy() it.

profile pictureAWS
Ivan
回答済み 2年前
  • Thanks for your hint. I found the problem. I used wrong docker image, EC2 one instead of SageMaker. I simplify my question and provide the answer here. Just in case, if someone get the same problem.

  • Hi, @holopekochan! Good catch! I'm glad that you have found it!

0
承認された回答

Should use the Sagemaker image

763104351884.dkr.ecr.$REGION.amazonaws.com/pytorch-inference:1.12.1-gpu-py38-cu116-ubuntu20.04-sagemaker

instead of ec2

763104351884.dkr.ecr.$REGION.amazonaws.com/pytorch-inference:1.12.1-gpu-py38-cu116-ubuntu20.04-ec2
profile picture
回答済み 2年前

ログインしていません。 ログイン 回答を投稿する。

優れた回答とは、質問に明確に答え、建設的なフィードバックを提供し、質問者の専門分野におけるスキルの向上を促すものです。

質問に答えるためのガイドライン

関連するコンテンツ