To update my own question:
I concluded that the job had somehow become corrupted, so I stopped it with the intention of starting a new one and seeing if the issue went away.
During the "stopping" process the progress counter updated to show that the images I'd labelled had actually been saved, but by then it was too late and the job was queued for termination.
I started a new job with the same settings (after rebuilding the image manifest from scratch) and hit the same issue: the initial 10 images were labelled and counted in the summary progress counter, but images labelled in subsequent launches do not update the counter. As a result there's no way of knowing how far along we are in the process, or of checking the labelled areas of any images already labelled.
Is this expected behaviour, or is this a bug?
Bump
Can anyone using Ground Truth for labelling shed any light on whether this lack of status update is normal?
How do we keep track of progress on a job?
BUMP AGAIN - I'll answer my own question in case anyone has the same issue.
It seems that Ground Truth just takes a very long time to update the progress of any labelling tasks. A job we created on Feb 23 has just shown progress on the Ground Truth console, so it seems to take about 5 days for any labelling work to be reflected in the console.
I have no idea why this is the case, and I've had no response from AWS to my query, but at least we can be confident that our work is being saved and the labels are being generated.
Hi jamesatfish,
I'm an AWS engineer working on SageMaker Ground Truth. I appreciate that you've been stuck with this for over a week, and I want to personally help make it better. It shouldn't take 5 days for a labeling task to be reflected in the console. We'd very much like to understand how you set up your labeling job so we can investigate it on our end. I sent you a private message on February 21st, but I'm replying here in the main thread in the hopes that you'll see this. Please feel free to send me a private message at your earliest convenience so we can discuss the particulars of your labeling job.
Thanks Jonathan!
Apologies for not seeing your reply or PM earlier; I don't seem to have received a notification email to let me know of the reply.
I have replied to your PM this morning with the ARN details, let me know if you need further background.
As an update to the thread, we spent a week labelling images in our most recently created task on the assumption that the updates to the task progress were just delayed. Unfortunately that was not the case, and we discovered this morning that Ground Truth has automatically cancelled our labelling job because "no progress had been made", causing us to lose most of the labelled images we had processed.
Hi jamesatfish,
I'm currently setting up a labeling job in Ground Truth and was hoping to avoid any pitfalls you came across through this process. Do you mind sharing what ended up being the resolutions to the issues you faced?
Thank you!
Naquent,
Here's a quick summary of what we learned:
- Ground Truth jobs are split into image batches. The first batch in any job is 10 images, but after that the default size of each batch is 1000 images. You need to get through a batch in its entirety before Ground Truth will prepare the next batch.
- By default Ground Truth gives you 4 days to complete each batch. If you fail to complete the batch within 4 days, the entire Ground Truth job is marked as failed and any progress is lost.
- You cannot control those 2 variables (batch size and batch timeout) via the AWS Admin Console, but if you create the Ground Truth job via the CLI you can control both, up to a maximum of 10 days for each batch to be valid. You can only set these when the job is created, not afterwards.
- Using the above, you'll need to find a suitable batch size that's a compromise between ensuring you get the batch completed and not being delayed at the end of each batch. If you're working on the labelling tasks as a full-time endeavour or with a dedicated team, set a large batch size so the team can label as many images as possible before waiting for the next batch to generate. If you're labelling on an ad-hoc basis, set a smaller batch size to ensure you get through the batch before the timeout.
- Always set the maximum 10-day timeout for each batch; there appears to be no penalty for doing so.
- AWS have changed the Ground Truth console output to update the progress of your labelling more frequently, so you can see how many images have been labelled, but there's no way to see where you are up to in a batch or how long you have left to complete the current batch. If you've got a large labelling job you'll want to keep track of that yourself somehow.
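For the last point, one way to keep track yourself is a small script. The following is only a minimal sketch based on the batch sizes and timeouts described above (first batch of 10, then 1000 by default, 4 days per batch by default). The names `batch_position` and `batch_deadline` are hypothetical helpers, not part of any AWS tooling. It assumes you feed in the total labelled count yourself (for example from the console, or from what I understand to be the `TotalLabeled` value under `LabelCounters` in the `DescribeLabelingJob` response) and that you note down when each batch became available.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

FIRST_BATCH_SIZE = 10      # from this thread: the first batch is 10 images
DEFAULT_BATCH_SIZE = 1000  # from this thread: default size of later batches


@dataclass
class BatchPosition:
    batch_number: int        # 1-based index of the current batch
    labelled_in_batch: int   # images already labelled in that batch
    remaining_in_batch: int  # images left before the next batch is prepared


def batch_position(total_labelled, total_images,
                   batch_size=DEFAULT_BATCH_SIZE,
                   first_batch_size=FIRST_BATCH_SIZE):
    """Work out which batch the job is in from the overall labelled count."""
    if not 0 <= total_labelled <= total_images:
        raise ValueError("total_labelled must be between 0 and total_images")
    if batch_size < 1 or first_batch_size < 1:
        raise ValueError("batch sizes must be at least 1")

    batch_number = 1
    start = 0  # number of images that belong to earlier, completed batches
    size = min(first_batch_size, total_images)
    # Step past every fully labelled batch that is not the final one.
    while total_labelled >= start + size and start + size < total_images:
        start += size
        batch_number += 1
        size = min(batch_size, total_images - start)

    done = total_labelled - start
    return BatchPosition(batch_number, done, size - done)


def batch_deadline(batch_started_at, lifetime_days=4):
    """Deadline for a batch, given a start time you recorded yourself."""
    return batch_started_at + timedelta(days=lifetime_days)
```

For example, `batch_position(510, 2500)` reports that you are in batch 2 with 500 images labelled and 500 still to go, and `batch_deadline(datetime(2019, 3, 1, 9, 0), lifetime_days=10)` gives the cut-off for a batch that started on 1 March with a 10-day timeout.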
I hope that helps!
James
Edited by: jamesatfish on Apr 9, 2019 8:42 AM
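As a rough illustration of the batch size and timeout settings in James's summary: my reading is that they correspond to `MaxConcurrentTaskCount` and `TaskAvailabilityLifetimeInSeconds` inside `HumanTaskConfig` in the `CreateLabelingJob` API, but that mapping is an assumption and is not confirmed anywhere in this thread. The helper name `human_task_overrides` is hypothetical, and the 10-day cap is simply the figure quoted above, so check the current service limits before relying on it.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MAX_BATCH_LIFETIME_DAYS = 10  # maximum quoted in this thread; may have changed


def human_task_overrides(batch_size,
                         batch_lifetime_days=MAX_BATCH_LIFETIME_DAYS):
    """Build only the batch-related fields of a HumanTaskConfig.

    Assumption: MaxConcurrentTaskCount is the "batch size" and
    TaskAvailabilityLifetimeInSeconds is the per-batch timeout.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if not 0 < batch_lifetime_days <= MAX_BATCH_LIFETIME_DAYS:
        raise ValueError("batch_lifetime_days must be between 0 and 10")
    return {
        "MaxConcurrentTaskCount": batch_size,
        "TaskAvailabilityLifetimeInSeconds":
            int(batch_lifetime_days * SECONDS_PER_DAY),
    }
```

The returned dictionary is only a fragment. It would need to be merged into the full `HumanTaskConfig` (work team, UI template, and so on) that you pass when creating the job through the CLI or an SDK, since these values cannot be changed after the job has been created.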
Thanks so much for sharing, James! This is really good information to have.
I am facing the same issue. I created the labeling job, and after more than 24 hours of submitting tagged images the status did not change. Then, when I stopped the labeling job, it updated the progress. How do I resolve this?