How to update cluster config when the original EBS snapshot is gone


Hi,
I have a cluster configured with ParallelCluster 2.10 that has been running for over half a year now. It has two EBS volumes mounted at /shared and /install. It seems that both EBS snapshots associated with these mount points have been deleted. This should not be an issue, since the snapshots are used only to initialize the cluster. However, when I try to update the cluster configuration now (simply adding some compute nodes by bumping max_queue_size), I get the following error message:
<code>
(venv_aws) > pcluster update flacscloudHPC-2-10-0 -c ./config_flacscloudHPC
Retrieving configuration from CloudFormation for cluster flacscloudHPC-2-10-0...
Validating configuration file ./config_flacscloudHPC...
WARNING: The configuration parameter 'scheduler' generated the following warnings:
The job scheduler you are using (torque) is scheduled to be deprecated in future releases of ParallelCluster. More information is available here: https://github.com/aws/aws-parallelcluster/wiki/Deprecation-of-SGE-and-Torque-in-ParallelCluster
ERROR: The section [ebs custom2] is wrongly configured
The snapshot snap-0870f8601759ca239 does not appear to exist: The snapshot 'snap-0870f8601759ca239' does not exist.
</code>
How can I update max_queue_size without the original snapshot 'snap-0870f8601759ca239'? Is it safe to forcefully reconfigure the cluster with some updated, existing snapshots?

asked 3 years ago, 318 views
2 Answers
Accepted answer

Hello mfolusiak1,
to perform the update, please make sure the cluster configuration satisfies the following:

  1. Keep ebs_snapshot_id set to the ID of the deleted snapshot.
  2. Make sure volume_size is also set. If it was not, add it and make sure it matches the size of the existing volume.
  3. Disable the sanity check by setting sanity_check to false.

After that, you can perform the update with the pcluster update command.
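
For illustration, the three points above might look roughly like this in a ParallelCluster 2.x config file. This is only a sketch under assumptions: the section name [ebs custom2] and the snapshot ID come from the error message in the question, while the cluster template name (default), the second EBS section name (custom1), the mount point assigned to custom2, and the numeric values are placeholders that must be replaced with the values from your own configuration.

```ini
[global]
# "default" is a placeholder for your cluster template name
cluster_template = default
# Point 3: skip the validation that fails on the deleted snapshot
sanity_check = false

[cluster default]
# Placeholder value: the new upper limit for compute nodes
max_queue_size = 20
# "custom1" is a hypothetical name for the other EBS section
ebs_settings = custom1, custom2

[ebs custom2]
# Assumption: custom2 is the volume mounted at /install
shared_dir = /install
# Point 1: keep the ID of the deleted snapshot unchanged
ebs_snapshot_id = snap-0870f8601759ca239
# Point 2: placeholder, must match the size (in GiB) of the existing volume
volume_size = 100
```

If the other [ebs] section also refers to a deleted snapshot, the same two settings would apply there. The size of the existing volume can be looked up in the EC2 console before setting volume_size.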

Edited by: luca-aws on Sep 7, 2021 5:40 AM

AWS
answered 3 years ago

Thank you for your help, this worked!

answered 3 years ago
