Does AWS Batch support EC2 Spot stop/start and hibernation?


Does AWS Batch support the EC2 Spot stop/start and hibernation options for handling interruptions? The options are documented here: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-interruptions.html

However, I cannot find any reference that says whether AWS Batch supports them. My understanding is that if a Spot Instance is terminated, AWS Batch retries the job from the start.

AWS
asked 6 years ago, 536 views
1 Answer
Accepted Answer

No. AWS Batch does not support the EC2 Spot stop/start and hibernation options in a managed compute environment (CE). If a Spot Instance is terminated, AWS Batch retries the job from the start, according to the retry strategy defined in your job definition. Your job can still resume its progress if it has checkpointing built in and saves its state somewhere that remains accessible after the interruption (e.g. EFS).
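To illustrate the checkpointing idea, here is a minimal sketch in Python, not an official AWS Batch feature or API. It assumes the job works through a numbered list of items and that a directory on shared storage (for example an EFS mount) is visible to every retry attempt. The `CHECKPOINT_DIR` environment variable, the file name, and the `stop_after` parameter (used only to simulate an interruption) are hypothetical names chosen for this example.

```python
import json
import os
import tempfile
from pathlib import Path

# Assumption: CHECKPOINT_DIR points at storage that survives the instance,
# such as an EFS mount. It falls back to the local temp dir for a dry run.
CHECKPOINT_PATH = (
    Path(os.environ.get("CHECKPOINT_DIR", tempfile.gettempdir()))
    / "job_checkpoint.json"
)


def load_checkpoint(path):
    """Return the saved state, or a fresh state if none exists or it is unreadable."""
    try:
        with open(path) as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return {"next_item": 0, "total": 0}


def save_checkpoint(path, state):
    """Write to a temp file and rename it, so a reader never sees a half-written file."""
    tmp = path.with_suffix(".tmp")
    with open(tmp, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def run_job(path, n_items, stop_after=None):
    """Process items 0..n_items-1, resuming from the last checkpoint.

    stop_after is a hypothetical hook that simulates a Spot interruption
    by returning early before the given item index is processed.
    """
    state = load_checkpoint(path)
    for i in range(state["next_item"], n_items):
        if stop_after is not None and i >= stop_after:
            return state  # simulated interruption
        state["total"] += i  # placeholder for the real unit of work
        state["next_item"] = i + 1
        save_checkpoint(path, state)
    return state


if __name__ == "__main__":
    print(run_job(CHECKPOINT_PATH, n_items=10))
```

When AWS Batch retries the job after a Spot termination, the new attempt starts the same command from the beginning; with a pattern like this, it reads the checkpoint and continues from `next_item` instead of redoing finished work. How often to checkpoint is a trade-off between write overhead and the amount of work lost on interruption.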

You can use hibernation with an unmanaged CE. However, because this is not officially supported, your job might get “stuck” when Spot reclaims your instance and the instance goes into hibernation. The job remains in the “running” state at the top of your job queue, and it prevents other jobs from running if you have hit the max vCPU limit of your CE.

AWS
AWS-Rey
answered 6 years ago
EXPERT
reviewed a month ago
  • Hi, just checking in: is this answer still up to date 5 years later? We are wondering whether hibernation-based Spot Instances can be used for our long-running Batch jobs to reduce cost.
