Pipeline training step's custom output

0

A pipeline training step saves a custom JSON file to the output path set in the estimator's output_path parameter, as shown below:

estimator = TensorFlow(
        entry_point=code_entry,
        source_dir=code_dir,
        output_path='s3://some-bucket/results/',
        [... other params...]
    )
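For context, a training script typically produces such a file by writing into the directory that SageMaker exposes through the SM_OUTPUT_DATA_DIR environment variable (conventionally /opt/ml/output/data). Whatever lands there is packed into output.tar.gz under the output path when the job ends. A minimal sketch, in which the helper name and the file name metrics.json are assumptions for illustration only:

```python
import json
import os


def save_custom_output(payload, filename="metrics.json"):
    """Write a JSON file where SageMaker collects custom training outputs."""
    # SM_OUTPUT_DATA_DIR is set inside SageMaker training containers;
    # the fallback is the conventional container path.
    output_dir = os.environ.get("SM_OUTPUT_DATA_DIR", "/opt/ml/output/data")
    os.makedirs(output_dir, exist_ok=True)
    path = os.path.join(output_dir, filename)
    with open(path, "w") as f:
        json.dump(payload, f)
    return path
```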

It seems there is no step property for accessing custom output files, unlike the one available for model artifacts: step.properties.ModelArtifacts.S3ModelArtifacts.

Also, unlike ProcessingStep, TrainingStep has no outputs argument that would let other steps read the S3 URI from the step arguments, for example something like this:

step.arguments["ProcessingOutputConfig"]["Outputs"][0]["S3Output"]["S3Uri"].

How can one access the custom outputs of a training step, stored in the output.tar.gz?

  • Running into the same issue and found your question. Did you find a solution for the above by any chance?

2 Answers
0

There should be a way to achieve this. The properties attribute just mirrors the response of the corresponding Describe* API call. The DescribeTrainingJob response includes an OutputDataConfig field, as documented at https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeTrainingJob.html#API_DescribeTrainingJob_ResponseSyntax

So it should be something like step.properties.OutputDataConfig.S3OutputPath.
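One caveat: S3OutputPath resolves to the base prefix passed as output_path, not to the archive itself. SageMaker normally writes custom outputs to <output_path>/<training-job-name>/output/output.tar.gz, so the job name still has to be appended. The sketch below shows that layout in plain Python; the helper name is hypothetical. Inside a pipeline definition the same concatenation would presumably be expressed with Join from sagemaker.workflow.functions, as in the commented lines, which is an assumption that has not been checked against every SDK version.

```python
def custom_output_uri(s3_output_path, training_job_name):
    """Build the S3 URI of output.tar.gz for a finished training job."""
    # Assumed layout: <output_path>/<job-name>/output/output.tar.gz
    base = s3_output_path.rstrip("/")
    return f"{base}/{training_job_name}/output/output.tar.gz"


# Pipeline-side equivalent (assumption, resolved at execution time).
# Note that a trailing slash in output_path would produce "//" here:
# Join(on="/", values=[
#     step.properties.OutputDataConfig.S3OutputPath,
#     step.properties.TrainingJobName,
#     "output/output.tar.gz",
# ])
print(custom_output_uri("s3://some-bucket/results/", "example-job"))
```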

AWS
Answered 1 year ago
  • Having the same problem. I tried setting an input to my evaluation job with the source "step.properties.OutputDataConfig.S3OutputPath", but it doesn't evaluate to the correct S3 path. Any other ideas?

-1

Hi,

You will need to provide the output path in the EstimatorBase of the training step.

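Example:

    from sagemaker.estimator import Estimator
    from sagemaker.inputs import TrainingInput
    from sagemaker.workflow.steps import TrainingStep

    # role_arn, region, scope_parameter, kms_key_arn and transform_step are
    # defined elsewhere in the pipeline script; other required Estimator
    # arguments such as image_uri are omitted here.
    estimator_example = Estimator(
        base_job_name="example",
        role=role_arn,
        instance_count=1,
        # Training outputs are written under this S3 prefix.
        output_path="s3://example-bucket/my-model",
        environment={"region": region.name, "scope": scope_parameter},
        output_kms_key=kms_key_arn,
    )

    training_step = TrainingStep(
        name="ExampleModelTraining",
        estimator=estimator_example,
        inputs={
            # Each channel reads from an output of the upstream processing step.
            "training_data": TrainingInput(
                s3_data=transform_step.properties.ProcessingOutputConfig.Outputs[
                    transform_training_data_output_name
                ].S3Output.S3Uri
            ),
            "training_target": TrainingInput(
                s3_data=transform_step.properties.ProcessingOutputConfig.Outputs[
                    transform_training_target_output_name
                ].S3Output.S3Uri
            ),
        },
    )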

Thanks,

AWS
Jady
Answered 1 year ago
  • Hi! As you can see in my question, I'm already providing the output_path in my estimator. The question is: how can other pipeline steps access the custom outputs generated by the training step?
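
Once a downstream step has output.tar.gz available locally (for example as a processing input), the custom file still has to be pulled out of the archive. A minimal sketch using only the standard library; the function name is hypothetical and the member name metrics.json is an assumption standing in for whatever the training script wrote:

```python
import json
import tarfile


def read_json_from_archive(archive_path, member_name="metrics.json"):
    """Load one JSON file from a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        # Members may be stored as "./metrics.json"; match on the base name.
        for member in tar.getmembers():
            if member.isfile() and member.name.split("/")[-1] == member_name:
                with tar.extractfile(member) as f:
                    return json.load(f)
    raise FileNotFoundError(f"{member_name} not found in {archive_path}")
```

Matching on the base name rather than the full member path keeps the lookup working whether the archive stores the file at its root or under a leading "./".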
