Unable to launch remote SageMaker pipeline; the error suggests the Python script is not found in the expected directory

I am trying to launch a SageMaker pipeline and am running into an issue where the container cannot find the .py script it is supposed to run.

Basic set up:

  • A Docker container that has been registered in AWS ECR.
  • A script called pipeline_test.py that contains only `print("Hello World")` (shown in full after this list).
  • A launcher script in the Code Editor in SageMaker Studio (though this could just as well be a local Visual Studio Code editor) that launches pipeline_test.py (code below).
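
For completeness, here is pipeline_test.py in full; the --processor flag passed in the launcher's entrypoint is simply ignored by it:

print("Hello World")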

Launcher script:

import json
import os
from sagemaker import image_uris
from sagemaker.processing import (
    Processor,
    ScriptProcessor,
    ProcessingInput,
    ProcessingOutput,
)
from sagemaker.session import Session

session = Session()
os.environ["IMAGE_URI"] = <value>
os.environ["ROLE"] = <value>
os.environ["BUCKET_NAME"] = <value>

# upload code to s3
code_input_path = (
    f"s3://{os.environ['BUCKET_NAME']}/pipeline_test/pipeline_test.py"
)

# output data in s3
data_output_path = f"s3://{os.environ['BUCKET_NAME']}/pipeline_test"

session.upload_data(
    bucket=os.environ["BUCKET_NAME"],
    key_prefix=f"pipeline_test",
    path="/home/sagemaker-user/data_science/pipeline_test.py",
)

# sagemaker container paths
container_base_path = "/opt/ml/processing"

def test_base_processor():

    # handle amazon sagemaker processing tasks
    processor = Processor(
        role=os.environ["ROLE"],
        image_uri=os.environ["IMAGE_URI"],
        instance_count=1,
        instance_type="ml.t3.medium",
        entrypoint=[
            "python",
            f"{container_base_path}/pipeline_test.py", # I also tried /input/pipeline_test.py
            "--processor=base-processor",
        ],
    )

    processor.run(
        job_name="processor-test-5",
        inputs=[
            ProcessingInput(
                source=code_input_path,
                destination=f"{container_base_path}/processor/input/",
            ),
        ],
        outputs=[
            ProcessingOutput(
                source=f"{container_base_path}/output/result",
                destination=f"{data_output_path}/result",
                output_name="test_result",
            ),
        ],
    )

test_base_processor()

Unfortunately, the pipeline fails, and when I check the CloudWatch logs I see the following error:

python: can't open file '/opt/ml/processing/pipeline_test.py': [Errno 2] No such file or directory

This is the Dockerfile:

# Use Python 3.10.14 base image
FROM --platform=linux/amd64 python:3.10.14 as build

# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    PYTHONIOENCODING=UTF-8 \
    LANG=C.UTF-8 \
    LC_ALL=C.UTF-8

# Install system dependencies
RUN apt-get update && \
    apt-get install -y \
    gcc \
    libpq-dev \
    libffi-dev

# Copy requirements.txt
COPY requirements.txt .

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt
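
Since the image is a plain python:3.10 base with no ENTRYPOINT of its own, one diagnostic I can run is to swap the python entrypoint for a recursive directory listing, so the CloudWatch logs show exactly where the input file lands inside the container. A minimal sketch, reusing the same image and input as above:

def debug_listing():
    # identical setup, but the entrypoint just lists the processing mount
    debug_processor = Processor(
        role=os.environ["ROLE"],
        image_uri=os.environ["IMAGE_URI"],
        instance_count=1,
        instance_type="ml.t3.medium",
        entrypoint=["ls", "-lR", container_base_path],
    )
    debug_processor.run(
        job_name="processor-debug",
        inputs=[
            ProcessingInput(
                source=code_input_path,
                destination=f"{container_base_path}/processor/input/",
            ),
        ],
    )

The listing shows the script at whatever path the ProcessingInput destination put it, which can then be compared against the path in the real entrypoint.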

Things I have checked:

  1. I can confirm that pipeline_test.py has successfully been uploaded to S3 (see the check after this list).
  2. I have tried many variations of the container_base_path, adding /processing/directory, removing it, etc.
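
The S3 check in point 1 is just a boto3 listing of the upload prefix, using the same bucket environment variable as above:

import boto3

s3 = boto3.client("s3")
resp = s3.list_objects_v2(
    Bucket=os.environ["BUCKET_NAME"],
    Prefix="pipeline_test/",
)
for obj in resp.get("Contents", []):
    print(obj["Key"])  # should include pipeline_test/pipeline_test.py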
Cyrus
asked a month ago · 107 views

1 Answer

Accepted Answer

The issue was that destination=f"{container_base_path}/processor/input/" needed to be destination=f"{container_base_path}/input/" — the input has to be mounted at a location that matches the script path in the entrypoint.
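
Putting it together (assuming the /input/ entrypoint variant mentioned in the code comment above, so that both sides refer to the same file):

processor = Processor(
    role=os.environ["ROLE"],
    image_uri=os.environ["IMAGE_URI"],
    instance_count=1,
    instance_type="ml.t3.medium",
    entrypoint=[
        "python",
        f"{container_base_path}/input/pipeline_test.py",  # where the input is mounted
        "--processor=base-processor",
    ],
)

processor.run(
    job_name="processor-test-fixed",
    inputs=[
        ProcessingInput(
            source=code_input_path,
            destination=f"{container_base_path}/input/",  # mounts the script at .../input/pipeline_test.py
        ),
    ],
)

Alternatively, ScriptProcessor (already imported in the launcher) handles this wiring itself: ScriptProcessor(..., command=["python3"]).run(code="pipeline_test.py", ...) uploads the script and sets the container entrypoint for you, so the two paths cannot drift apart.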

Cyrus
answered a month ago
EXPERT reviewed a month ago
