How to create a training step in a SageMaker pipeline?


I have the following project structure. I clone my project in SageMaker Studio and create a SageMaker pipeline (sample below). In the processing step I can pass a ProcessingInput that points to my utils folder, which holds additional helper code (source="src/utils"). I assume the folder gets copied to the SageMaker instance, because I can use the helper.py module in my processing.py. This setup works. I want to use a similar construct for my training step, but in the documentation I don't see where I can pass similar inputs for the training step. How can I specify in my training step that additional code or folders should be copied to the training instance, so that I can use those helper methods there?

pipeline_project
      src
          processing.py
          train.py
      utils
           helper.py
from sagemaker.processing import ScriptProcessor, ProcessingInput, ProcessingOutput
from sagemaker.workflow.steps import ProcessingStep

script_processor = ScriptProcessor(
    command=["python3"],
    image_uri="image_uri",
    role="role_arn",
    instance_count=1,
    instance_type="ml.m5.xlarge",
)

step_process = ProcessingStep(
    name="ProcessStep",
    processor=script_processor,
    code="src/processing.py",
    inputs=[
        # Copy the helper folder into the processing container
        ProcessingInput(
            input_name="utils",
            source="src/utils",
            destination="/opt/ml/processing/input/src/utils",
        )
    ],
)
asked a year ago, 630 views
1 Answer

When you create a training step, you pass an Estimator object to it (see the xgb_estimator object in the code below). The estimator accepts a source_dir argument: put your entry point script and any additional code dependencies in that directory, and its contents are uploaded for the training job together with the entry point.

Create the Estimator

from sagemaker.xgboost.estimator import XGBoost

xgb_estimator = XGBoost(
    entry_point="abalone.py",
    source_dir="code",
    hyperparameters=hyperparameters,
    role=role,
    instance_count=1,
    instance_type="ml.m5.2xlarge",
    framework_version="1.0-1",
)
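
For the project layout in the question, where train.py and the utils folder live in different places, one option is to stage both into a single folder and point source_dir at it. Below is a minimal sketch using only the standard library; the function name stage_source_dir and the target folder are hypothetical and not part of the SageMaker SDK.

```python
import shutil
from pathlib import Path


def stage_source_dir(entry_script: Path, helper_dir: Path, target: Path) -> Path:
    """Copy the entry script and a helper folder into one directory for source_dir."""
    if target.exists():
        shutil.rmtree(target)  # start from a clean folder on every run
    target.mkdir(parents=True)
    # The entry point ends up at the top level of the staged folder
    shutil.copy2(entry_script, target / entry_script.name)
    # Keep the helper folder name so it can be imported under that name
    shutil.copytree(helper_dir, target / helper_dir.name)
    return target
```

With the staged folder in place, the estimator would be created with entry_point="train.py" and source_dir set to that folder. Assuming the container runs the entry point from the extracted source_dir, train.py should then be able to use `from utils import helper`. If utils already sits next to train.py, pointing source_dir at their common parent folder should be enough and no staging is needed.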

Provide the Estimator as an argument to the Training step

from sagemaker.inputs import TrainingInput
from sagemaker.workflow.pipeline_context import PipelineSession
from sagemaker.workflow.steps import TrainingStep
from sagemaker.xgboost.estimator import XGBoost

pipeline_session = PipelineSession()

xgb_estimator = XGBoost(..., sagemaker_session=pipeline_session)

step_args = xgb_estimator.fit(
    inputs={
        "train": TrainingInput(
            s3_data=step_process.properties.ProcessingOutputConfig.Outputs[
                "train"
            ].S3Output.S3Uri,
            content_type="text/csv"
        ),
        "validation": TrainingInput(
            s3_data=step_process.properties.ProcessingOutputConfig.Outputs[
                "validation"
            ].S3Output.S3Uri,
            content_type="text/csv"
        )
    }
)

step_train = TrainingStep(
    name="TrainAbaloneModel",
    step_args=step_args,
)
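
On the training side, each key passed to fit() ("train", "validation") becomes an input channel, and the training container typically exposes the local path of each channel through an SM_CHANNEL_<NAME> environment variable. Below is a minimal sketch of how an entry point script might resolve these paths; parse_args is a hypothetical helper, and the /opt/ml/... fallbacks are the conventional defaults, so check them against your container.

```python
import argparse
import os


def parse_args(argv=None):
    """Resolve channel and model paths from the environment, with CLI overrides."""
    parser = argparse.ArgumentParser()
    # Assumption: the container sets SM_CHANNEL_<NAME> for each fit() channel
    parser.add_argument(
        "--train",
        default=os.environ.get("SM_CHANNEL_TRAIN", "/opt/ml/input/data/train"),
    )
    parser.add_argument(
        "--validation",
        default=os.environ.get("SM_CHANNEL_VALIDATION", "/opt/ml/input/data/validation"),
    )
    parser.add_argument(
        "--model-dir",
        default=os.environ.get("SM_MODEL_DIR", "/opt/ml/model"),
    )
    # Hyperparameters arrive as extra CLI flags; ignore those not declared here
    args, _unknown = parser.parse_known_args(argv)
    return args


if __name__ == "__main__":
    args = parse_args()
    print(f"train={args.train} validation={args.validation} model_dir={args.model_dir}")
```

Helper code uploaded through source_dir can be imported in the same script, next to this argument handling.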
Ashish
answered a year ago
