AWS Backup deleting VMware VMs when backups fail, resulting in data loss

We are using AWS Backup with an AWS VMware SDDC. After about 60 days of using the product, and after reaching ~150 VMs in the SDDC, we started encountering an issue with AWS Backup.

AWS Backup initiates a snapshot; 10 seconds later the VM shuts down and ends up in a corrupt, unrecoverable state.

The disks show 0 MB under settings in VMware, and the following error is displayed in vCenter:

Some of the disks of the virtual machine <Hostname> failed to load. The information present for them in the virtual machine configuration may be incomplete

Following this, VMware HA attempts to fail over:

"vSphere HA failover operation in progress in cluster Cluster-1 in datacenter SDDC-Datacenter: 0 VMs being restarted, 2 VMs waiting for a retry, 0 VMs waiting for resources, 3 inaccessible vSAN VMs"

but it never succeeds, and it hangs with this message for hours or days.

AWS Backup displays the following error:

Failed to create backup during snapshot creation. Aborted backup job

The net result is that we lose 24+ hours of data since the last successful snapshot. VMware support's only advice is to restore the last backup from AWS, resulting in data loss. This started with 1 VM and we are now up to 10, with a direct correlation between the number of backups run and the number of VMs lost.

asked a year ago, 1397 views
1 Answer

Hello. During snapshot creation, the backup process uses the VMware API to create a snapshot. The above message indicates that this call failed. There are no calls during this process to reboot the VM. You can turn on logging of the Backup Gateway and vCenter interactions by following step 5 here: https://docs.aws.amazon.com/aws-backup/latest/devguide/working-with-hypervisors.html#edit-hypervisor

You may need to involve VMware support for further investigation. The logs mentioned above will give you the interaction details.
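
To line up the failed jobs with the affected VMs before opening a case, something like the following rough sketch might help. It assumes boto3 with credentials for the account that owns the backup vault, and it assumes the affected jobs end up in the FAILED or ABORTED state (not confirmed in this thread). The helper names `fetch_failed_vm_jobs`, `count_by_vm` and `main` are illustrative only; the only AWS call used is `list_backup_jobs`.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def fetch_failed_vm_jobs(client, days=7):
    """Collect FAILED and ABORTED VirtualMachine backup jobs from the last `days` days.

    `client` is expected to be a boto3 "backup" client (or anything exposing
    a compatible list_backup_jobs method).
    """
    since = datetime.now(timezone.utc) - timedelta(days=days)
    jobs = []
    # Assumption: the problem jobs are reported in one of these two states.
    for state in ("FAILED", "ABORTED"):
        kwargs = {
            "ByState": state,
            "ByResourceType": "VirtualMachine",
            "ByCreatedAfter": since,
        }
        # Follow NextToken manually until the last page is reached.
        while True:
            page = client.list_backup_jobs(**kwargs)
            jobs.extend(page.get("BackupJobs", []))
            token = page.get("NextToken")
            if not token:
                break
            kwargs["NextToken"] = token
    return jobs


def count_by_vm(jobs):
    """Count failed jobs per VM, keyed by the job's ResourceArn."""
    return Counter(job["ResourceArn"] for job in jobs)


def main():
    # Imported here so the helpers above can be used without boto3 installed.
    import boto3

    client = boto3.client("backup")
    jobs = fetch_failed_vm_jobs(client, days=7)
    for job in jobs:
        # StatusMessage carries text such as the snapshot-creation error.
        print(job.get("CreationDate"), job.get("ResourceArn"),
              job.get("State"), job.get("StatusMessage"))
    for arn, n in count_by_vm(jobs).most_common():
        print(n, arn)
```

Calling `main()` prints one line per failed job and then a per-VM count, which can be compared against the timestamps in the gateway and vCenter logs.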

AWS would be happy to debug further if you create a support case with AWS and provide details of the gateway. Thank you.

AWS
answered a year ago
