Unable to launch remote SageMaker pipeline, error suggests Python script not found in directory


I am trying to launch a SageMaker pipeline and am running into an issue where the container cannot find the .py script that is being launched.

Basic set up:

  • A Docker container that has been registered in AWS ECR.
  • A script called pipeline_test.py that contains only `print("Hello World")` (shown below for reference).
  • A launcher script in the code editor in SageMaker Studio (but this could just as well be a local Visual Studio Code editor) that launches pipeline_test.py (code found below).
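
For reference, the entire contents of pipeline_test.py are:

print("Hello World")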

Launcher script:

import json
import os
from sagemaker import image_uris
from sagemaker.processing import (
    Processor,
    ScriptProcessor,
    ProcessingInput,
    ProcessingOutput,
)
from sagemaker.session import Session

session = Session()
os.environ["IMAGE_URI"] = <value>
os.environ["ROLE"] = <value>
os.environ["BUCKET_NAME"] = <value>

# upload code to s3
code_input_path = (
    f"s3://{os.environ['BUCKET_NAME']}/pipeline_test/pipeline_test.py"
)

# output data in s3
data_output_path = f"s3://{os.environ['BUCKET_NAME']}/pipeline_test"

session.upload_data(
    bucket=os.environ["BUCKET_NAME"],
    key_prefix=f"pipeline_test",
    path="/home/sagemaker-user/data_science/pipeline_test.py",
)

# sagemaker container paths
container_base_path = "/opt/ml/processing"

def test_base_processor():

    # handle amazon sagemaker processing tasks
    processor = Processor(
        role=os.environ["ROLE"],
        image_uri=os.environ["IMAGE_URI"],
        instance_count=1,
        instance_type="ml.t3.medium",
        entrypoint=[
            "python",
            f"{container_base_path}/pipeline_test.py", # I also tried /input/pipeline_test.py
            "--processor=base-processor",
        ],
    )

    processor.run(
        job_name="processor-test-5",
        inputs=[
            ProcessingInput(
                source=code_input_path,
                destination=f"{container_base_path}/processor/input/",
            ),
        ],
        outputs=[
            ProcessingOutput(
                source=f"{container_base_path}/output/result",
                destination=f"{data_output_path}/result",
                output_name="test_result",
            ),
        ],
    )

test_base_processor()

Unfortunately, the pipeline fails, and when I check the CloudWatch logs I see the following error:

python: can't open file '/opt/ml/processing/pipeline_test.py': [Errno 2] No such file or directory

This is the Dockerfile:

# Use Python 3.10.14 base image
FROM --platform=linux/amd64 python:3.10.14 as build

# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    PYTHONIOENCODING=UTF-8 \
    LANG=C.UTF-8 \
    LC_ALL=C.UTF-8

# Install system dependencies
RUN apt-get update && \
    apt-get install -y \
    gcc \
    libpq-dev \
    libffi-dev

# Copy requirements.txt
COPY requirements.txt .

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

Things I have checked:

  1. I can confirm that pipeline_test.py has successfully been pushed to S3.
  2. I have tried many variations of container_base_path (adding /processing/directory, removing it, etc.).
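
One thing I have not tried yet is listing the container filesystem to see where the ProcessingInput actually lands. A minimal debugging sketch (reusing the same image, role, and input path from above; the job name is hypothetical) would be something like:

# Debug run: replace the entrypoint with a recursive listing of
# /opt/ml/processing so CloudWatch shows where the input is copied.
debug_processor = Processor(
    role=os.environ["ROLE"],
    image_uri=os.environ["IMAGE_URI"],
    instance_count=1,
    instance_type="ml.t3.medium",
    entrypoint=["ls", "-R", "/opt/ml/processing"],
)

debug_processor.run(
    job_name="processor-debug-1",
    inputs=[
        ProcessingInput(
            source=code_input_path,
            destination=f"{container_base_path}/processor/input/",
        ),
    ],
)
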
Cyrus
asked 13 days ago · 87 views
1 Answer
Accepted Answer

The issue was that destination=f"{container_base_path}/processor/input/" needed to be destination=f"{container_base_path}/input/", so that the script is copied to the same directory the container entrypoint points at (/opt/ml/processing/input/pipeline_test.py).
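
For reference, the working combination looks roughly like this (same paths as in the question; the ProcessingInput destination and the entrypoint now reference the same directory, and the job name is arbitrary):

processor = Processor(
    role=os.environ["ROLE"],
    image_uri=os.environ["IMAGE_URI"],
    instance_count=1,
    instance_type="ml.t3.medium",
    entrypoint=[
        "python",
        f"{container_base_path}/input/pipeline_test.py",  # matches the input destination below
        "--processor=base-processor",
    ],
)

processor.run(
    job_name="processor-test-6",
    inputs=[
        ProcessingInput(
            source=code_input_path,
            destination=f"{container_base_path}/input/",  # was .../processor/input/
        ),
    ],
    outputs=[
        ProcessingOutput(
            source=f"{container_base_path}/output/result",
            destination=f"{data_output_path}/result",
            output_name="test_result",
        ),
    ],
)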

Cyrus
answered 13 days ago
