How to create a custom label dataset by feeding manifest programmatically


Hello,

From https://docs.aws.amazon.com/rekognition/latest/customlabels-dg/cd-create-dataset.html
I see that I can create a manifest without using SageMaker, as long as it conforms to the format specified here: https://docs.aws.amazon.com/rekognition/latest/customlabels-dg/cd-required-fields.html

But that Custom Labels Guide only shows that I can supply/specify my manifest by clicking on "Import image Labeled by SageMaker Ground Truth"

Is there a way to create or modify a dataset and supply my manifest programmatically?

Thanks.

Edited by: mymingle on Mar 2, 2020 5:48 PM

asked 4 years ago, 416 views
7 Answers
Accepted Answer

Hey mymingle

The console is just one way to kick off your training. It provides an end-to-end experience for uploading images, annotating them as required, training, and viewing the results.

You can also create the manifest file by any other means. As long as it meets the format requirements the training will be able to consume it. You can find the format here - https://docs.aws.amazon.com/rekognition/latest/customlabels-dg/cd-manifest-files.html

Please note the difference in the format for classification (no bounding boxes) and object detection (one or more bounding boxes per image). From your previous posts it looks like you have bounding boxes. If that is the case, refer to the section "Required Fields for Object Detection" in the above link.
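For illustration, one object-detection manifest line could be assembled as below. This is only a minimal sketch, assuming the SageMaker Ground Truth object-detection layout described on the linked page. The bucket, image key, label attribute name `bounding-box`, job name, and the helper `make_detection_line` are all hypothetical, so please check the field names against the docs before relying on it.

```python
import json
from datetime import datetime, timezone


def make_detection_line(s3_uri, width, height, boxes, class_map,
                        attr="bounding-box", job_name="my-labeling-job"):
    """Build one JSON Lines entry for an object-detection manifest.

    boxes: list of (class_id, left, top, box_width, box_height) in pixels.
    class_map: dict mapping class_id (int) to label name.
    attr and job_name are placeholder values (assumptions).
    """
    annotations = [
        {"class_id": cid, "left": left, "top": top, "width": w, "height": h}
        for cid, left, top, w, h in boxes
    ]
    entry = {
        "source-ref": s3_uri,
        attr: {
            "image_size": [{"width": width, "height": height, "depth": 3}],
            "annotations": annotations,
        },
        f"{attr}-metadata": {
            # One confidence entry per bounding box.
            "objects": [{"confidence": 1} for _ in boxes],
            "class-map": {str(k): v for k, v in class_map.items()},
            "type": "groundtruth/object-detection",
            "human-annotated": "yes",
            "creation-date": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S"),
            "job-name": job_name,
        },
    }
    # A manifest is JSON Lines: each image is one JSON object on one line.
    return json.dumps(entry)


line = make_detection_line(
    "s3://my-bucket/images/img_0001.jpg", 640, 480,
    boxes=[(0, 399, 251, 155, 101)],
    class_map={0: "Echo Dot"},
)
print(line)
```

For classification the entry would carry a class label instead of the `annotations` list, so this helper only covers the bounding-box case.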

As you build your file, please ensure it also meets the validation rules specified here - https://docs.aws.amazon.com/rekognition/latest/customlabels-dg/cd-manifest-files-validation-rules.html
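Before uploading, a quick local sanity check can catch the most basic problems. The sketch below is not the service's validation logic; it only assumes that each line must be a single JSON object with an S3 `source-ref`, and the helper name `check_manifest_lines` is hypothetical. The linked validation rules remain the authority.

```python
import json


def check_manifest_lines(lines):
    """Return a list of (line_number, problem) for basic structural issues."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError as exc:
            problems.append((number, f"not valid JSON: {exc.msg}"))
            continue
        if not isinstance(entry, dict):
            problems.append((number, "line is not a JSON object"))
            continue
        source = entry.get("source-ref")
        if not isinstance(source, str) or not source.startswith("s3://"):
            problems.append((number, "missing or non-S3 source-ref"))
    return problems


# Placeholder lines: the first is fine, the other two are flagged.
sample = [
    '{"source-ref": "s3://my-bucket/images/a.jpg"}',
    '{"source-ref": "images/b.jpg"}',
    'not json',
]
for number, problem in check_manifest_lines(sample):
    print(number, problem)
```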

Once you have a manifest file ready, you can kick off the training via the CLI by:

  1. First creating a project
    https://docs.aws.amazon.com/rekognition/latest/dg/API_CreateProject.html
  2. Creating a project version under this project
    https://docs.aws.amazon.com/rekognition/latest/dg/API_CreateProjectVersion.html
    This API kicks off the actual training process.
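The two steps above can be sketched with the Python SDK roughly as follows. This is a hedged example, assuming a boto3 Rekognition client and the CreateProject / CreateProjectVersion request shapes as documented at the links above. The project name, version name, bucket, and manifest keys are placeholders, and `start_training` is a hypothetical helper.

```python
def start_training(client, project_name, version_name, bucket,
                   train_manifest_key, test_manifest_key, output_prefix):
    """Create a project, then start training from two manifests in S3.

    client is expected to be boto3.client("rekognition").
    Returns the ARN of the new project version (the model being trained).
    """
    def manifest_asset(key):
        # Points the API at a manifest file stored in S3.
        return {"GroundTruthManifest": {"S3Object": {"Bucket": bucket, "Name": key}}}

    # Step 1: create the project.
    project_arn = client.create_project(ProjectName=project_name)["ProjectArn"]

    # Step 2: create a project version, which starts the training.
    response = client.create_project_version(
        ProjectArn=project_arn,
        VersionName=version_name,
        OutputConfig={"S3Bucket": bucket, "S3KeyPrefix": output_prefix},
        TrainingData={"Assets": [manifest_asset(train_manifest_key)]},
        TestingData={"Assets": [manifest_asset(test_manifest_key)]},
    )
    return response["ProjectVersionArn"]


# Example call (requires boto3 and AWS credentials; all names are placeholders):
#   import boto3
#   arn = start_training(boto3.client("rekognition"), "my-project", "v1",
#                        "my-bucket", "manifests/train.manifest",
#                        "manifests/test.manifest", "training-output/")
```

The client is passed in rather than created inside the function so the call sequence can be exercised with a stub before touching a real account.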

You have correctly pointed to the doc link for doing the same thing via the SDK.

AWS
answered 4 years ago

Hello,

Of course there is a way! ;)

I tried to do that 1 or 2 months ago and I've just written a blog post on this, with the OpenImages dataset as an example.

You can check it here https://www.legiasquad.com/articles/retraining-aws-rekognition-with-custom-labels-on-an-annotated-dataset/

There is also a code sample that I used to generate this :)

I hope this helps!

answered 4 years ago

Hi, I have a manifest built from my dataset. The ask is to be able to run a script so that bboxes are created from the manifest prior to training, and without needing to interact with any GUI, e.g. clicking on the radio button "Import images labeled by SageMaker Ground Truth". Thanks all the same.

answered 4 years ago

Hello

If you have a manifest (with or without bounding boxes) in the prescribed format, you can kick off training without the console by using the CLI's create-project-version command:
https://docs.aws.amazon.com/cli/latest/reference/rekognition/create-project-version.html

You can also use the SDK to programmatically kick off a training using the CreateProjectVersion API: https://docs.aws.amazon.com/rekognition/latest/dg/API_CreateProjectVersion.html

Could you please elaborate on "script so that bboxes are created from the manifest prior to training"?

AWS
answered 4 years ago

For now, I have to use the GUI: I select the radio button "Import images labeled by SageMaker Ground Truth" and specify the manifest's location in the bucket. Once I submit that through the GUI, I can see bounding boxes drawn on the images in the console, based on the bbox coordinates in the manifest. Then I click the Train button and the training starts.

I want to avoid these manual GUI steps. If I can pass the manifest file and kick off the training using that, I will return here and mark this question as answered.

It seems I can, from reading this example: https://docs.aws.amazon.com/rekognition/latest/customlabels-dg/tm-sdk.html

thanks.

answered 4 years ago

Following the SDK example and using CreateProjectVersion, I can kick off training and implicitly create the datasets by supplying two manifests, one for training and another for testing.
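If only a single manifest exists, the two files could be produced by splitting its lines. The sketch below is one possible approach and not something prescribed in this thread: the 80/20 ratio, the fixed seed, and the helper name `split_manifest` are assumptions, and the sample lines are placeholders.

```python
import random


def split_manifest(lines, test_fraction=0.2, seed=0):
    """Shuffle manifest lines and split them into (train, test) lists."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    # Ignore blank lines so they do not end up in either manifest.
    entries = [line for line in lines if line.strip()]
    # A seeded shuffle keeps the split reproducible between runs.
    random.Random(seed).shuffle(entries)
    test_count = max(1, round(len(entries) * test_fraction))
    return entries[test_count:], entries[:test_count]


all_lines = [f'{{"source-ref": "s3://my-bucket/images/{i}.jpg"}}' for i in range(10)]
train_lines, test_lines = split_manifest(all_lines)
print(len(train_lines), len(test_lines))
```

Each list can then be written out one entry per line and uploaded to S3 as the training and testing manifests.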

I also want to credit sdoloris. His website provides a helpful tutorial for creating the manifest. Marking my question as answered. Thank you.

Edited by: mymingle on Mar 17, 2020 4:03 PM

answered 4 years ago

Great - please feel free to reach out if you have any other questions!

AWS
answered 4 years ago
