Use Ground Truth bbox labels in a TensorFlow fine-tuning job


I can use the manifest file created by Ground Truth (ref) as input to a training job for object detection (ref). Can I also use it for TensorFlow jobs, like the one here? In all the tutorials I find, the data in annotations.json has a different format from the one in the Ground Truth output.

My goal is to use more ad-hoc models rather than just ResNet and VGG, and to get information from TensorBoard and similar tools.

1 Answer
Accepted Answer

Hello Lorenzo,

In general, you can use a Ground Truth augmented manifest to train a TensorFlow model with SageMaker using Pipe mode. Augmented manifests are supported only in Pipe input mode.

So the specific example you asked about is not going to work with an augmented manifest, because it uses script mode. If you just meant to use the estimator, you can modify the code to something like this.
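To illustrate the Pipe mode point, here is a minimal sketch of what an augmented-manifest input channel could look like in the shape that the SageMaker CreateTrainingJob API expects under InputDataConfig. The helper name augmented_manifest_channel, the bucket path, and the label attribute name "bounding-box" are assumptions for illustration; the attribute name must match whatever your labeling job actually wrote into the manifest.

```python
def augmented_manifest_channel(manifest_s3_uri, attribute_names, channel_name="training"):
    """Build one InputDataConfig entry for an augmented manifest (hypothetical helper)."""
    return {
        "ChannelName": channel_name,
        # Augmented manifests are read in Pipe mode, not File mode.
        "InputMode": "Pipe",
        "RecordWrapperType": "RecordIO",
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "AugmentedManifestFile",
                "S3Uri": manifest_s3_uri,
                "S3DataDistributionType": "FullyReplicated",
                # Which keys of each manifest line are streamed to the container.
                "AttributeNames": list(attribute_names),
            }
        },
    }


# Example values below are placeholders, not real resources.
channel = augmented_manifest_channel(
    "s3://example-bucket/labeling-job/manifests/output/output.manifest",
    ["source-ref", "bounding-box"],
)
print(channel["DataSource"]["S3DataSource"]["S3DataType"])
```

The resulting dictionary would go in the InputDataConfig list of a training job request; the training script then has to read the streamed records itself.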

Please read and follow this for more details.

AWS
SUPPORT ENGINEER
Jann_P
answered a year ago
EXPERT
reviewed 23 days ago
  • Thanks Jann! What I would like to do is train an object detection model using images annotated with Ground Truth. The model could be YOLOv5 or similar. I would like to have access to TensorBoard for metrics visualization. Do you have code examples for that? Thanks!

  • Hello Lorenzo, yes, you can follow the document below to set up TensorBoard. Please note that this creates an app in your domain, so charges will apply.

    "Amazon SageMaker with TensorBoard runs the TensorBoard application on an ml.r5.large instance and incurs charges after the SageMaker free tier or the free trial period of the feature. For more information, see Amazon SageMaker Pricing."

    https://docs.aws.amazon.com/sagemaker/latest/dg/tensorboard-on-sagemaker.html

  • Thanks Jann. The problem seems to be that the JumpStart models (e.g. in s3://jumpstart-cache-prod-us-east-1/source-directory-tarballs/tensorflow/transfer_learning/od1/v1.1.0/sourcedir.tar.gz) don't know how to open those annotation files. I'm seeing the error:

    FileNotFoundError: [Errno 2] No such file or directory: '/opt/ml/input/data/training/annotations.json'

    The JumpStart code is still looking for the annotations.json file, and it probably wants to use that to load the annotations.

  • In general, would the models in JumpStart be able to load the images and annotations listed in the augmented manifest file produced by Ground Truth?

  • Unfortunately, this is not supported for JumpStart models, as they expect a different format. You may need to write a custom script to convert the format, which you have probably already done.
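A conversion script of the kind mentioned above could look roughly like the sketch below. It assumes the Ground Truth bounding-box manifest layout (one JSON object per line with "source-ref" and a label attribute holding "image_size" and "annotations"), and it assumes the JumpStart object detection code wants an annotations.json with "images" and "annotations" lists where each bbox is [xmin, ymin, xmax, ymax]. The function name manifest_to_annotations, the default label attribute "bounding-box", and the sample line are illustrative only; check both formats against your own files before relying on this.

```python
import json


def manifest_to_annotations(manifest_lines, label_attr="bounding-box"):
    """Convert Ground Truth bounding-box manifest lines into a single dict
    with "images" and "annotations" lists (assumed target layout)."""
    images, annotations = [], []
    for line in manifest_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        label = record.get(label_attr)
        if label is None:
            continue  # no label stored under this attribute name
        image_id = len(images)
        size = label["image_size"][0]
        images.append({
            # Keep only the file name; images are assumed to sit next to annotations.json.
            "file_name": record["source-ref"].rsplit("/", 1)[-1],
            "height": size["height"],
            "width": size["width"],
            "id": image_id,
        })
        for box in label["annotations"]:
            xmin, ymin = box["left"], box["top"]
            annotations.append({
                "image_id": image_id,
                # Ground Truth stores left/top/width/height; convert to corner form.
                "bbox": [xmin, ymin, xmin + box["width"], ymin + box["height"]],
                "category_id": box["class_id"],
            })
    return {"images": images, "annotations": annotations}


# Made-up manifest line for illustration.
sample_line = json.dumps({
    "source-ref": "s3://example-bucket/images/image.jpg",
    "bounding-box": {
        "image_size": [{"width": 500, "height": 400, "depth": 3}],
        "annotations": [{"class_id": 0, "left": 111, "top": 134, "width": 61, "height": 128}],
    },
})
result = manifest_to_annotations([sample_line])
print(json.dumps(result, indent=2))
```

In practice you would read the lines from the downloaded output.manifest, write the result with json.dump to annotations.json, and upload it together with the images to the training channel.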
