Error when saving custom metrics in SageMaker Experiments through SageMaker Pipelines Training Job

I have a customer for whom I am enabling SageMaker Experiments in a training job that runs through SageMaker Pipelines. The logic below, inserted into the training script, was working fine until a few days ago: it logged custom metrics to the trial component created by SageMaker Pipelines.

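    from sklearn.metrics import mean_squared_error  # assumed source of mean_squared_error
    from smexperiments.tracker import Tracker

    try:
        # Reuse the trial component associated with the current job
        print('>>> Loading an existing trial component')
        my_tracker = Tracker.load()
    except ValueError:
        # Fall back to a fresh trial component if none could be loaded
        print('>>> Creating a new trial component')
        my_tracker = Tracker.create()

    # Log the validation MSE as a custom metric, then close the tracker
    my_tracker.log_metric("mse:mse error", mean_squared_error(valid_y, preds))
    my_tracker.close()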

However, since yesterday the same code has been failing with the following error:

    Loading an existing trial component
    Traceback (most recent call last):
      File "training.py", line 82, in <module>
        my_tracker = Tracker.load()
      File "/miniconda3/lib/python3.7/site-packages/smexperiments/tracker.py", line 161, in load
        _ArtifactUploader(tc.trial_component_name, artifact_bucket, artifact_prefix, boto3_session),
    AttributeError: 'NoneType' object has no attribute 'trial_component_name'

I tried downgrading sagemaker and sagemaker-experiments to older versions, but the issue persists. The code works when I launch the training job on its own, outside SageMaker Pipelines, but fails with the above error when the job runs through SageMaker Pipelines. Any pointers on how to fix this?

AWS
Asked 1 year ago, viewed 456 times
1 answer

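The SageMaker Python SDK uses Boto3 as its backend, so a change in Boto3 can affect behavior even when the sagemaker and sagemaker-experiments versions stay the same. You may also want to roll back Boto3 and pin its version.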
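To decide which versions to pin, it can help to print what is actually installed inside the training container and compare a failing pipeline run against a working standalone run. Below is a minimal sketch of that check. The helper name `installed_versions` and the list of packages are illustrative assumptions, not part of any SageMaker API, and the Python 3.7 fallback assumes the `importlib_metadata` backport is installed.

```python
try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:
    # Python 3.7 (as in the traceback): assumes the importlib_metadata backport is installed
    import importlib_metadata as metadata


def installed_versions(packages):
    """Return {package name: installed version, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Hypothetical package list; adjust it to whatever the training image installs.
    to_check = ["boto3", "botocore", "sagemaker", "sagemaker-experiments"]
    for name, version in installed_versions(to_check).items():
        print(f"{name}=={version}" if version else f"{name} is not installed")
```

If the output differs between the two runs, the `name==version` lines from the working run are a reasonable starting point for pins in the training image's requirements file.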

AWS
Answered 1 year ago
