Unable to launch remote SageMaker pipeline, error suggests Python script not found in directory


I am trying to launch a SageMaker pipeline and running into an issue where the container cannot find the Python script being launched.

Basic setup:

  • A Docker container that has been registered in AWS ECR.
  • A script called pipeline_test.py that contains only `print("Hello World")`.
  • A launcher script running in the Code Editor in SageMaker Studio (though this could just as well be a local Visual Studio editor) with code that launches pipeline_test.py (code below).

Launcher script:

import json
import os
from sagemaker import image_uris
from sagemaker.processing import (
    Processor,
    ScriptProcessor,
    ProcessingInput,
    ProcessingOutput,
)
from sagemaker.session import Session

session = Session()
os.environ["IMAGE_URI"] = <value>
os.environ["ROLE"] = <value>
os.environ["BUCKET_NAME"] = <value>

# upload code to s3
code_input_path = (
    f"s3://{os.environ['BUCKET_NAME']}/pipeline_test/pipeline_test.py"
)

# output data in s3
data_output_path = f"s3://{os.environ['BUCKET_NAME']}/pipeline_test"

session.upload_data(
    bucket=os.environ["BUCKET_NAME"],
    key_prefix=f"pipeline_test",
    path="/home/sagemaker-user/data_science/pipeline_test.py",
)

# sagemaker container paths
container_base_path = "/opt/ml/processing"

def test_base_processor():

    # handle amazon sagemaker processing tasks
    processor = Processor(
        role=os.environ["ROLE"],
        image_uri=os.environ["IMAGE_URI"],
        instance_count=1,
        instance_type="ml.t3.medium",
        entrypoint=[
            "python",
            f"{container_base_path}/pipeline_test.py", # I also tried /input/pipeline_test.py
            "--processor=base-processor",
        ],
    )

    processor.run(
        job_name="processor-test-5",
        inputs=[
            ProcessingInput(
                source=code_input_path,
                destination=f"{container_base_path}/processor/input/",
            ),
        ],
        outputs=[
            ProcessingOutput(
                source=f"{container_base_path}/output/result",
                destination=f"{data_output_path}/result",
                output_name="test_result",
            ),
        ],
    )

test_base_processor()

Unfortunately, the pipeline fails, and when I check the CloudWatch logs I see the following error:

python: can't open file '/opt/ml/processing/pipeline_test.py': [Errno 2] No such file or directory

This is the Dockerfile:

# Use Python 3.10.14 base image
FROM --platform=linux/amd64 python:3.10.14 as build

# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    PYTHONIOENCODING=UTF-8 \
    LANG=C.UTF-8 \
    LC_ALL=C.UTF-8

# Install system dependencies
RUN apt-get update && \
    apt-get install -y \
    gcc \
    libpq-dev \
    libffi-dev

# Copy requirements.txt
COPY requirements.txt .

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

Things I have checked:

  1. I can confirm that pipeline_test.py has successfully been pushed to S3.
  2. I have tried many variations of container_base_path (adding the /processing/ directory, removing it, etc.).
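
To see exactly where the ProcessingInput lands inside the container, one option would be to rerun the job with the entrypoint temporarily swapped for a directory listing. A minimal sketch, reusing the same variables as the launcher above (the job name here is just a placeholder):

# Same image and input as above, but the entrypoint simply lists every file
# under /opt/ml/processing, so the script's actual location shows up in the logs.
debug_processor = Processor(
    role=os.environ["ROLE"],
    image_uri=os.environ["IMAGE_URI"],
    instance_count=1,
    instance_type="ml.t3.medium",
    entrypoint=["find", container_base_path, "-type", "f"],
)

debug_processor.run(
    job_name="processor-debug-1",
    inputs=[
        ProcessingInput(
            source=code_input_path,
            destination=f"{container_base_path}/processor/input/",
        ),
    ],
)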
1 Answer
Accepted Answer

The issue was that destination=f"{container_base_path}/processor/input/" needed to be destination=f"{container_base_path}/input/". The script is downloaded to whatever directory the ProcessingInput destination specifies, so the entrypoint has to point at that same directory.
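
A minimal corrected sketch of the relevant part of the launcher, assuming the entrypoint is also pointed at the /input/ path (the variant mentioned in the comment in the question); the job name is just a placeholder:

# Entrypoint and ProcessingInput destination now agree on where the script lives.
processor = Processor(
    role=os.environ["ROLE"],
    image_uri=os.environ["IMAGE_URI"],
    instance_count=1,
    instance_type="ml.t3.medium",
    entrypoint=[
        "python",
        f"{container_base_path}/input/pipeline_test.py",
        "--processor=base-processor",
    ],
)

processor.run(
    job_name="processor-test-fixed",  # placeholder job name
    inputs=[
        ProcessingInput(
            source=code_input_path,
            # The script is downloaded here: /opt/ml/processing/input/pipeline_test.py
            destination=f"{container_base_path}/input/",
        ),
    ],
)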

Cyrus
answered a month ago
