I have a customer who is working on enabling SageMaker Experiments tracking in a training job that runs through SageMaker Pipelines. The logic below is inserted into the training script. Until a few days ago it worked fine, logging custom metrics to the trial component created by SageMaker Pipelines.
from smexperiments.tracker import Tracker

try:
    print('>>> Loading an existing trial component')
    my_tracker = Tracker.load()
except ValueError:
    print('>>> Creating a new trial component')
    my_tracker = Tracker.create()

# Log the validation MSE as a custom metric, then close the tracker
my_tracker.log_metric("mse:mse error", mean_squared_error(valid_y, preds))
my_tracker.close()
However, since yesterday the same code fails with the following error:
Loading an existing trial component
Traceback (most recent call last):
  File "training.py", line 82, in <module>
    my_tracker = Tracker.load()
  File "/miniconda3/lib/python3.7/site-packages/smexperiments/tracker.py", line 161, in load
    _ArtifactUploader(tc.trial_component_name, artifact_bucket, artifact_prefix, boto3_session),
AttributeError: 'NoneType' object has no attribute 'trial_component_name'
I tried downgrading both sagemaker and sagemaker-experiments to older versions, but the issue persists. The code works when I launch the training job on its own, outside of SageMaker Pipelines, and fails with the error above only when the job runs as a pipeline step. Any pointers on how to fix this?
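For reference, here is a stopgap I am considering so that the training step at least does not crash. It is only a sketch: it assumes the AttributeError means Tracker.load() did not get a trial component back for the pipeline-run job, which I have not confirmed. The helper name load_or_create_tracker is hypothetical, and it takes the tracker class as an argument so it can be tried without the SDK installed.

```python
def load_or_create_tracker(tracker_cls):
    """Return a tracker, falling back to create() when load() fails.

    tracker_cls is assumed to expose load() and create() classmethods,
    as smexperiments.tracker.Tracker does in the script above.
    """
    try:
        print('>>> Loading an existing trial component')
        return tracker_cls.load()
    except (ValueError, AttributeError) as err:
        # ValueError: the case the original script already handled.
        # AttributeError: the failure shown in the traceback above
        # (assumed here to mean no trial component was resolved).
        print(f'>>> load() failed ({err!r}); creating a new trial component')
        return tracker_cls.create()
```

In the training script this would be called as my_tracker = load_or_create_tracker(Tracker). Note that the fallback would create a separate trial component, so the metrics would presumably not land in the one created by SageMaker Pipelines. It hides the symptom rather than fixing the cause.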