Custom labels programmatically

Hi

Is there a way to create a dataset with custom label images programmatically? I have a large number of images which have already been labeled so I don't want to do this again via the console.

All of the tutorials and demos use the console. I can't fin

Any help appreciated

Thanks

Edited by: chunt on Dec 30, 2019 5:14 AM

chunt
asked 4 years ago, 251 views
4 Answers
Accepted Answer

Hello,

Yes, you can train a model programmatically, without using the console, through the CreateProject and CreateProjectVersion APIs. CreateProjectVersion takes TrainingData and/or TestingData assets, each supplied as a GroundTruthManifest, that is, a manifest file in the SageMaker Ground Truth output format.

https://docs.aws.amazon.com/rekognition/latest/dg/API_CreateProject.html
https://docs.aws.amazon.com/rekognition/latest/dg/API_CreateProjectVersion.html
https://docs.aws.amazon.com/rekognition/latest/customlabels-dg/cd-manifest-files.html
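As a rough illustration of the two steps above, here is a minimal Python sketch using boto3. It assumes the image-level label layout described in the manifest documentation linked above, so check the field names against that page before relying on them. The label attribute name `my-label`, the helper function names, and all bucket, key, and project names are hypothetical placeholders, not anything from the original question.

```python
import json
from datetime import datetime, timezone


def image_level_manifest_line(source_ref, class_name, label_attr="my-label"):
    """Build one JSON Lines manifest entry giving an image a single label.

    Assumption: Ground Truth image-classification layout; label_attr is a
    hypothetical attribute name chosen for this sketch.
    """
    entry = {
        "source-ref": source_ref,  # s3:// URI of the already-uploaded image
        label_attr: 1,             # placeholder class identifier
        f"{label_attr}-metadata": {
            "confidence": 1,
            "job-name": f"labeling-job/{label_attr}",
            "class-name": class_name,
            "human-annotated": "yes",
            "creation-date": datetime.now(timezone.utc).strftime(
                "%Y-%m-%dT%H:%M:%S.%f"
            ),
            "type": "groundtruth/image-classification",
        },
    }
    return json.dumps(entry)


def project_version_request(project_arn, version_name, bucket,
                            manifest_key, output_prefix):
    """Assemble keyword arguments for CreateProjectVersion."""
    manifest = {"S3Object": {"Bucket": bucket, "Name": manifest_key}}
    return {
        "ProjectArn": project_arn,
        "VersionName": version_name,
        "OutputConfig": {"S3Bucket": bucket, "S3KeyPrefix": output_prefix},
        "TrainingData": {"Assets": [{"GroundTruthManifest": manifest}]},
        # Ask the service to split a test set off the training data.
        "TestingData": {"AutoCreate": True},
    }


def train(project_name, version_name, bucket, manifest_key, output_prefix):
    """Create a project and start training; needs boto3 and AWS credentials."""
    import boto3  # imported here so the helpers above work without boto3

    client = boto3.client("rekognition")
    project_arn = client.create_project(ProjectName=project_name)["ProjectArn"]
    request = project_version_request(
        project_arn, version_name, bucket, manifest_key, output_prefix
    )
    return client.create_project_version(**request)["ProjectVersionArn"]
```

In this sketch you would write one `image_level_manifest_line(...)` result per labeled image into a manifest file, upload that file to S3, and then call `train(...)` with the manifest's bucket and key.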

AWS
answered 4 years ago

Hello,

I am trying to upload new training images to an already existing project version using the .NET API. So far I have uploaded the image via the S3 client to the assets folder of the bucket, but now I am stuck. According to the bucket folder structure, the image sits in the same place as the other images, but it doesn't show up in the list of unlabeled images. Being new to manifests, I fear that I must add the new images to one of them...

... or is there an API function I am missing, or anything else that could help me achieve this?

answered 4 years ago

Hey @HamPeter,

The S3 sub-folder for a dataset is not a live store. To add images to a dataset, you must also modify the corresponding SageMaker manifest file. You can think of this manifest file as the "source of truth" for all the annotations and contents of your dataset.

UI-Based

  1. Use the in-console GUI to add images to a dataset.

Within the Custom Labels console, there is an "Add Images" button. Selecting it lets you upload images directly into a dataset in batches of 20 at a time. This feature ensures that your images are stored in the correct S3 bucket, automatically handles image name collisions, and updates your dataset directly in S3.

Programmatic

  1. Open the dataset manifest file in S3 and append a new line to it for each new image.

You can do this with a simple append operation: cat additional_images.manifest >> output.manifest

After appending the entries for your new images, re-upload output.manifest to the same S3 location, which creates a new, updated version of that file in the console.

Although this approach works better for large numbers of images, you will need to handle image-name collisions yourself (for example, a new dog_image.jpg clashing with an existing dog_image.jpg) and make sure the images you add are stored in the console bucket, because the training service has limitations on using images from multiple buckets.
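The two caveats above can be checked before appending. The following Python sketch is one possible way to do it, not an official tool: it assumes each manifest line is a JSON object with a `source-ref` S3 URI, and the function name `merge_manifest_lines` and the bucket name in the usage note are hypothetical.

```python
import json
from urllib.parse import urlparse


def merge_manifest_lines(existing_lines, new_lines, required_bucket):
    """Append new manifest entries, skipping collisions and foreign buckets.

    Returns (merged_lines, skipped), where skipped is a list of
    (source_ref, reason) tuples for entries that were not appended.
    """
    merged = []
    seen = set()
    for line in existing_lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines in the existing manifest
        seen.add(json.loads(line)["source-ref"])
        merged.append(line)

    skipped = []
    for line in new_lines:
        line = line.strip()
        if not line:
            continue
        ref = json.loads(line)["source-ref"]
        # For s3://bucket/key the bucket name is the URL's network location.
        if urlparse(ref).netloc != required_bucket:
            skipped.append((ref, "image is not in the required bucket"))
        elif ref in seen:
            skipped.append((ref, "source-ref already present in manifest"))
        else:
            seen.add(ref)
            merged.append(line)
    return merged, skipped
```

A typical use would be to read output.manifest and additional_images.manifest into lists of lines, call `merge_manifest_lines(existing, new, "my-console-bucket")`, review the skipped entries, and write the merged lines back (joined with newlines) before re-uploading.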

If you have any additional questions, please don't hesitate to reach out!

answered 4 years ago

That's great - thank you.

chunt
answered 4 years ago
