How to update cluster config when the original ebs snapshot is gone


Hi,
I have a cluster configured with ParallelCluster 2.10 that has been running for over half a year now. It has two EBS volumes mounted at /shared and /install. It seems that both EBS snapshots associated with these mount points have been deleted. This should not be an issue, since the snapshots are used only when the cluster is initialized. However, when I try to update the cluster configuration now (simply adding some compute nodes by bumping max_queue_size), I get the following error message:
<code>
(venv_aws) > pcluster update flacscloudHPC-2-10-0 -c ./config_flacscloudHPC
Retrieving configuration from CloudFormation for cluster flacscloudHPC-2-10-0...
Validating configuration file ./config_flacscloudHPC...
WARNING: The configuration parameter 'scheduler' generated the following warnings:
The job scheduler you are using (torque) is scheduled to be deprecated in future releases of ParallelCluster. More information is available here: https://github.com/aws/aws-parallelcluster/wiki/Deprecation-of-SGE-and-Torque-in-ParallelCluster
ERROR: The section [ebs custom2] is wrongly configured
The snapshot snap-0870f8601759ca239 does not appear to exist: The snapshot 'snap-0870f8601759ca239' does not exist.
</code>
How can I update max_queue_size without the original snapshot 'snap-0870f8601759ca239'? Is it safe to forcefully reconfigure the cluster to point at some newer, existing snapshots instead?

asked 3 years ago, 231 views
2 Answers
Accepted Answer

Hello mfolusiak1,
To be able to perform the update, please make sure the following points hold in the cluster configuration:

  1. Keep ebs_snapshot_id set to the ID of the deleted snapshot.
  2. Make sure volume_size is also set. If it was not, add it and make sure it matches the size of the existing volume.
  3. Disable the sanity check by setting sanity_check to false.

After that, you can perform the update with the pcluster update command.
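
As a rough illustration, the three points above might look like this in a ParallelCluster 2.x configuration file. This is only a sketch under assumptions: the section name [ebs custom2] and the snapshot ID come from the error message in the question, while the cluster section name, the mount point assigned to custom2, and the volume_size and max_queue_size values are hypothetical placeholders that must be replaced with the values of the real cluster.

```ini
[global]
# Assumed cluster section name; use the one already in your file.
cluster_template = default
# Point 3: skip validation of the resources referenced by the config,
# so the missing snapshot no longer blocks the update.
sanity_check = false

[cluster default]
# The change the update is meant to apply (hypothetical value).
max_queue_size = 20

[ebs custom2]
# Assumed mount point; the question does not say which volume is custom2.
shared_dir = /install
# Point 1: keep the ID of the deleted snapshot unchanged.
ebs_snapshot_id = snap-0870f8601759ca239
# Point 2: size in GiB of the EXISTING volume (hypothetical value).
volume_size = 100
```

Only the settings relevant to this update are shown; every other setting in the existing file stays as it is.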

Edited by: luca-aws on Sep 7, 2021 5:40 AM

AWS
answered 3 years ago

Thank you for your help, this worked!

answered 3 years ago
