What is validation set adjustment while the system is auto labeling?


How does it work, and what is its purpose?

INFO:samurai_science_object_detection.cli:Running validation set adjustment.

1 Answer

When Ground Truth starts an auto labeling job, it selects a random sample of the input data and sends it to human workers for labeling. When the labeled data is returned, Ground Truth creates a training set and a validation set from it, and uses these datasets to train and validate the model used for auto labeling.

As with ML models in general, validation is done by evaluating the model on a complementary subset of the input data that is not used for training. In Ground Truth auto labeling, this validation set is adjusted periodically (at every iteration of the labeling job) to improve the accuracy of the automated labels.
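
To illustrate the idea of splitting human-labeled data into complementary training and validation subsets, here is a minimal sketch in plain Python. It is not Ground Truth's actual implementation, which is not public; the function name `split_train_validation` and the `validation_fraction` parameter are hypothetical and chosen only for illustration.

```python
import random

def split_train_validation(labeled_ids, validation_fraction, seed=0):
    """Randomly split labeled item IDs into disjoint training and validation lists.

    Hypothetical helper for illustration; not part of any AWS API.
    """
    rng = random.Random(seed)       # seeded so the split is reproducible
    shuffled = list(labeled_ids)
    rng.shuffle(shuffled)
    n_val = round(len(shuffled) * validation_fraction)
    # The first n_val shuffled items form the validation set, the rest train the model.
    return shuffled[n_val:], shuffled[:n_val]

# Example: 1501 labeled items, with roughly one third held out for validation.
train, val = split_train_validation(range(1501), validation_fraction=1 / 3)
```

In a sketch like this, re-running the split with a different `validation_fraction` at each iteration would change how many labeled items are held out, which is one way a validation set could be "adjusted" between rounds.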

If you have further specific questions about your workflows, or need a deep dive into your logs, you may open a support case using this link, as we may require details that are non-public information, and we will be happy to assist you further.

How it works - https://docs.aws.amazon.com/sagemaker/latest/dg/sms-automated-labeling.html#sms-automated-labeling-how-it-works

Cross-Validation - https://docs.aws.amazon.com/machine-learning/latest/dg/cross-validation.html

answered a month ago
  • The size of the validation set shrank at each round, and the data removed from it moved to the training set.

    In the 1st round of training, there were 1000 samples in the training set and 501 in the validation set. In the 2nd round there were 2202 in the training set and 299 in the validation set. In the 3rd round there were 3381 in the training set and 120 in the validation set.

    I am curious about the mechanism behind this.

    Thank you!
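
As a quick arithmetic check on the counts reported in this comment, the short sketch below computes the total number of labeled samples and the validation share for each round. The numbers are copied from the comment; the function name `summarize` is hypothetical, and the calculation describes only the reported counts, not the mechanism Ground Truth uses internally.

```python
def summarize(rounds):
    """Return (total, validation_share) for each (train_count, val_count) pair."""
    return [(train + val, val / (train + val)) for train, val in rounds]

# (training set size, validation set size) per round, as reported in the comment
reported = [(1000, 501), (2202, 299), (3381, 120)]
summary = summarize(reported)

for i, (total, share) in enumerate(summary, start=1):
    print(f"round {i}: total={total}, validation share={share:.1%}")
```

The totals come out to 1501, 2501, and 3501, so the labeled pool grew by 1000 samples per round while the validation share fell from about 33.4% to 12.0% to 3.4%.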
