Pipeline training step's custom output

0

A pipeline training step saves a custom JSON file to the output path set in the estimator's output_path parameter, as shown below:

estimator = TensorFlow(
    entry_point=code_entry,
    source_dir=code_dir,
    output_path='s3://some-bucket/results/',
    [... other params...]
)

It seems there is no step property for accessing custom output files, unlike the one available for model artifacts: step.properties.ModelArtifacts.S3ModelArtifacts.

Also, unlike ProcessingStep, TrainingStep has no outputs argument that would let other steps read the S3 URI from the step arguments, e.g. something like this:

step.arguments["ProcessingOutputConfig"]["Outputs"][0]["S3Output"]["S3Uri"].

How can one access the custom outputs of a training step, stored in the output.tar.gz?
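
For context, here is a minimal sketch of how a training script could produce such a file. It assumes the script writes into the directory named by the SM_OUTPUT_DATA_DIR environment variable (falling back to /opt/ml/output/data), whose contents SageMaker packs into output.tar.gz when the job ends. The function name save_custom_output and the file name metrics.json are hypothetical.

```python
import json
import os


def save_custom_output(payload, filename="metrics.json", output_dir=None):
    """Write payload as JSON into the training job's output data directory."""
    # Assumption: SM_OUTPUT_DATA_DIR is set inside the training container;
    # output_dir exists only so the function can be tried outside SageMaker.
    target_dir = output_dir or os.environ.get(
        "SM_OUTPUT_DATA_DIR", "/opt/ml/output/data"
    )
    os.makedirs(target_dir, exist_ok=True)
    path = os.path.join(target_dir, filename)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(payload, handle)
    return path
```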

  • I'm running into the same issue and found your question. Did you find a solution by any chance?

2 Answers
0

There should be a way to achieve this. The properties attribute simply mirrors the response of the corresponding Describe* API call. The DescribeTrainingJob response includes an OutputDataConfig field; see https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeTrainingJob.html#API_DescribeTrainingJob_ResponseSyntax

So it should be something like step.properties.OutputDataConfig.S3OutputPath
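
One caveat worth checking: S3OutputPath is only the prefix that was passed as output_path, not the location of the archive itself. Below is a minimal sketch of how the full URI could be composed, assuming the usual layout <output_path>/<training job name>/output/output.tar.gz. The helper name custom_output_uri is hypothetical.

```python
def custom_output_uri(s3_output_path, training_job_name):
    """Build the assumed S3 URI of a training job's output.tar.gz."""
    # Strip a trailing slash so the joined URI has no empty path segment.
    prefix = s3_output_path.rstrip("/")
    # Assumption: the job writes its archive to <prefix>/<job name>/output/.
    return f"{prefix}/{training_job_name}/output/output.tar.gz"


print(custom_output_uri("s3://some-bucket/results/", "my-training-job"))
# s3://some-bucket/results/my-training-job/output/output.tar.gz
```

Inside a pipeline definition these values are only resolved at run time, so the same pieces would probably have to be combined with sagemaker.workflow.functions.Join (joining step.properties.OutputDataConfig.S3OutputPath, step.properties.TrainingJobName and "output/output.tar.gz" on "/"). That is an untested suggestion, not a confirmed solution.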

AWS
answered a year ago
  • I'm having the same problem. I tried setting an input of my evaluation job with the source "step.properties.OutputDataConfig.S3OutputPath", but it doesn't evaluate to the correct S3 path. Any other ideas?

-1

Hi,

You need to set the output path on the estimator (any EstimatorBase subclass) that the training step uses.

Examples

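    from sagemaker.estimator import Estimator
    from sagemaker.inputs import TrainingInput
    from sagemaker.workflow.steps import TrainingStep

    # role_arn, region, scope_parameter, kms_key_arn, transform_step and the
    # transform_*_output_name variables are assumed to be defined elsewhere.
    # Required Estimator arguments such as image_uri are not shown here.
    estimator_example = Estimator(
        base_job_name="example",
        role=role_arn,
        instance_count=1,
        output_path="s3://example-bucket/my-model",  # S3 prefix for job output
        environment={"region": region.name, "scope": scope_parameter},
        output_kms_key=kms_key_arn,
    )

    training_step = TrainingStep(
        name="ExampleModelTraining",
        estimator=estimator_example,
        inputs={
            # Each channel reads an output of the upstream processing step.
            "training_data": TrainingInput(
                s3_data=transform_step.properties.ProcessingOutputConfig.Outputs[
                    transform_training_data_output_name
                ].S3Output.S3Uri
            ),
            "training_target": TrainingInput(
                s3_data=transform_step.properties.ProcessingOutputConfig.Outputs[
                    transform_training_target_output_name
                ].S3Output.S3Uri
            ),
        },
    )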

Thanks,

AWS
Jady
answered a year ago
  • Hi! As you can see in my question, I'm already providing the output_path in my estimator. The question is: how can other pipeline steps access the custom outputs generated by the training step?
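
Once a downstream step has output.tar.gz available locally (for example, as an input of a processing job), the custom file can be read with the standard library. This is a minimal sketch under that assumption. The function name read_custom_output and the member name metrics.json are hypothetical, and the member's exact path inside the archive should be checked against a real job's output.

```python
import json
import tarfile


def read_custom_output(archive_path, member_name="metrics.json"):
    """Return the parsed JSON stored under member_name in a .tar.gz archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        # extractfile raises KeyError if the member is missing and returns
        # None if the member is not a regular file (e.g. a directory).
        extracted = archive.extractfile(member_name)
        if extracted is None:
            raise ValueError(f"{member_name} is not a regular file in {archive_path}")
        with extracted:
            return json.load(extracted)
```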
