Ensuring Equal Distribution of Tasks Among Workers in AWS Ground Truth Private Workforce

Description:

I am using AWS Ground Truth for a bounding-box annotation task. The dataset consists of Y images, and I have X workers in my private workforce. I observed that tasks are handed out first-come, first-served (FCFS), so a single worker can complete all the images, which prevents a fair distribution among the available workers. The "Number of workers per dataset object" setting is 1.

What I Want to Achieve:

I need to distribute Y images equally among X workers, ensuring that each worker annotates only a subset of images. For example:

  • If 10 images need to be annotated and 2 workers are available, each worker should get 5 images.
  • No single worker should complete all images before others get a chance.

Issues Faced:

  1. FCFS Problem:
    • Currently, one active worker can take all the images, leaving nothing for others.
    • No built-in mechanism ensures that tasks are fairly divided.

Potential Solutions I Am Exploring:

  1. Manually Creating Worker-Specific Manifests
    • Instead of a single manifest file, I create separate manifests for each worker and launch individual labeling jobs.
    • Question: Is this the best approach for private workforces?
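
A minimal sketch of the split described above, assuming the input manifest is a JSON Lines file with one object per line (for example `{"source-ref": "s3://..."}`). The function name `split_manifest` and the bucket paths are hypothetical; uploading the shards to S3 and launching the jobs are left out.

```python
import json


def split_manifest(lines, num_workers):
    """Deal manifest lines round-robin into one shard per worker.

    Shard sizes differ by at most one, so 10 lines and 2 workers
    yield two shards of 5 lines each.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    shards = [[] for _ in range(num_workers)]
    # Skip blank lines so a trailing newline does not create an empty task.
    records = [ln.strip() for ln in lines if ln.strip()]
    for index, record in enumerate(records):
        json.loads(record)  # fail early on a malformed manifest line
        shards[index % num_workers].append(record)
    return shards


# Hypothetical example: 10 images, 2 workers.
manifest_lines = [
    json.dumps({"source-ref": f"s3://my-bucket/images/{i}.jpg"})
    for i in range(10)
]
shards = split_manifest(manifest_lines, 2)
for worker_index, shard in enumerate(shards):
    print(f"worker {worker_index}: {len(shard)} images")
```

Each shard would then be written out as its own manifest file, one line per record, and used as the input for that worker's labeling job.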

Key Questions:

  • What is the best way to ensure equal task distribution in AWS Ground Truth?
  • Does AWS GT support auto-balancing tasks among workers?
  • Can I use a Lambda function to filter tasks per worker dynamically?
  • Are there alternative approaches recommended for private workforces?
1 Answer

Hi, broadly you're correct: as far as I'm aware, there's no built-in mechanism to limit the number or percentage of tasks each worker can complete within a job. So if you want this behavior, you'd need to create separate labeling jobs and assign each one to a single-user "team".

You could automate splitting the manifests and creating/managing the separate teams through Lambda, EventBridge, and similar integrations. But overall, I'd say it's a pretty unusual pattern and not one I'd particularly recommend.
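
If you do go down that route, the automation could look roughly like the sketch below. It only builds one request per worker from a shared template; the helper name `per_worker_job_requests`, the ARNs, and the S3 paths are made up for illustration. The overridden keys (`LabelingJobName`, `InputConfig.DataSource.S3DataSource.ManifestS3Uri`, `HumanTaskConfig.WorkteamArn`) are parameters of the SageMaker `CreateLabelingJob` API, but the template here is deliberately incomplete, so treat it as a starting point rather than a working job definition.

```python
import copy


def per_worker_job_requests(base_request, assignments):
    """Clone a labeling-job request once per worker, swapping in that
    worker's manifest, single-user work team, and a unique job name."""
    requests = []
    for assignment in assignments:
        request = copy.deepcopy(base_request)  # leave the template untouched
        request["LabelingJobName"] = (
            f'{base_request["LabelingJobName"]}-{assignment["suffix"]}'
        )
        s3_source = request["InputConfig"]["DataSource"]["S3DataSource"]
        s3_source["ManifestS3Uri"] = assignment["manifest_s3_uri"]
        request["HumanTaskConfig"]["WorkteamArn"] = assignment["workteam_arn"]
        requests.append(request)
    return requests


# Hypothetical template; a real request needs more fields (RoleArn,
# OutputConfig, UiConfig, and so on).
base_request = {
    "LabelingJobName": "bbox-job",
    "LabelAttributeName": "bbox",
    "InputConfig": {"DataSource": {"S3DataSource": {"ManifestS3Uri": ""}}},
    "HumanTaskConfig": {
        "WorkteamArn": "",
        "NumberOfHumanWorkersPerDataObject": 1,
    },
}
assignments = [
    {
        "suffix": "w0",
        "manifest_s3_uri": "s3://my-bucket/manifests/w0.manifest",
        "workteam_arn": "arn:aws:sagemaker:us-east-1:111122223333:workteam/private-crowd/team-w0",
    },
    {
        "suffix": "w1",
        "manifest_s3_uri": "s3://my-bucket/manifests/w1.manifest",
        "workteam_arn": "arn:aws:sagemaker:us-east-1:111122223333:workteam/private-crowd/team-w1",
    },
]
job_requests = per_worker_job_requests(base_request, assignments)
```

With a complete template, each entry could be passed to `boto3.client("sagemaker").create_labeling_job(**request)`.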

Perhaps you could focus on recognition rather than hard limits. In your output manifest files I believe you should see a labeler ID. It won't be the user's email address, but I tentatively think it might be an ID you could look up against the Amazon Cognito User Pool that Ground Truth creates in your account. If you can resolve the user IDs from the output manifests to the individuals in your organization, you could produce and share leaderboards that recognize the people who contributed most to the effort.
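
As a rough sketch of the leaderboard idea: the code below tallies completed tasks per labeler ID. It assumes each output record has already been parsed into a dict with an `answers` list whose entries carry a `workerId` field. That layout is an assumption on my part, so check it against the files your own job produces and adjust `id_field` as needed. The function name `leaderboard` is hypothetical, and mapping IDs to people via Cognito is not shown.

```python
from collections import Counter


def leaderboard(worker_responses, id_field="workerId"):
    """Count completed tasks per labeler ID, highest count first.

    `worker_responses` is an iterable of parsed JSON records; the
    `answers` / `workerId` layout is assumed, not guaranteed.
    """
    counts = Counter()
    for response in worker_responses:
        for answer in response.get("answers", []):
            worker_id = answer.get(id_field)
            if worker_id:
                counts[worker_id] += 1
    return counts.most_common()


# Hypothetical records standing in for parsed job output.
sample_responses = [
    {"answers": [{"workerId": "private.us-east-1.aaa"}]},
    {"answers": [{"workerId": "private.us-east-1.bbb"}]},
    {"answers": [{"workerId": "private.us-east-1.aaa"}]},
]
for worker_id, completed in leaderboard(sample_responses):
    print(worker_id, completed)
```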

This would help resolve the tension between enforcing fairness and getting the work completed by whoever's available to help out.

AWS
EXPERT
answered 2 months ago
