Why does my AWS Backup job fail?

I want to take a backup of my resources using AWS Backup but the job fails with a FAILED status.

Short description

When you back up your AWS resources, AWS Backup creates a backup job that provides details about the backup process. A backup job in the FAILED status indicates that a backup was attempted, but wasn't successful. You can check the status and details of a backup job by using either the AWS Backup console or the describe-backup-job AWS CLI command (the DescribeBackupJob API operation).

If no backup job is triggered for your resource, then see Why are my scheduled backup plans in AWS Backup not running?

Before troubleshooting, be aware of the following:

  • AWS Backup doesn't retry a failed backup job. The FAILED status indicates that the attempt wasn't successful and that no backup (recovery point) was created.
  • If backup jobs are triggered by a scheduled backup plan, then a new backup job is created for the resource during the next scheduled runtime. If backup jobs are triggered manually or on-demand, then you must make a new StartBackupJob request to take a backup of the resource.
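
For on-demand jobs, the new StartBackupJob request can reuse the details of the failed job. The following is a minimal sketch, assuming that boto3 is installed and credentials are configured. The helper names build_retry_request and retry_failed_job are hypothetical. The sketch copies only the three required StartBackupJob parameters from the DescribeBackupJob response, so lifecycle or tag settings would need to be added separately.

```python
def build_retry_request(job: dict) -> dict:
    """Build StartBackupJob parameters from a DescribeBackupJob response."""
    if job.get("State") != "FAILED":
        raise ValueError(f"job is {job.get('State')!r}, not FAILED")
    # The three parameters that StartBackupJob requires.
    return {
        "BackupVaultName": job["BackupVaultName"],
        "ResourceArn": job["ResourceArn"],
        "IamRoleArn": job["IamRoleArn"],
    }


def retry_failed_job(backup_job_id: str) -> str:
    """Start a new on-demand job for the resource of a FAILED job."""
    import boto3  # imported here so build_retry_request works without boto3

    client = boto3.client("backup")
    job = client.describe_backup_job(BackupJobId=backup_job_id)
    response = client.start_backup_job(**build_retry_request(job))
    return response["BackupJobId"]  # ID of the new job, not the failed one
```

Fix the cause of the failure before you call retry_failed_job. Otherwise, the new job is likely to fail for the same reason.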

Resolution

Check the status message of the FAILED job

Failed backup jobs have a corresponding status message. The status message provides information about why the backup job failed. This can help you troubleshoot the issue.
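
As an illustration, the status message can be read from the DescribeBackupJob response and matched against the errors that this article covers. This is a rough sketch, not an official mapping: the substrings come from the messages quoted in this article, the category labels and the helper names classify_failure and describe_job are hypothetical, and real status messages can be worded differently.

```python
# Substrings taken from the messages quoted in this article (assumption:
# real messages contain them). A miss means "read the message yourself".
KNOWN_FAILURES = {
    "not authorized": "permissions",
    "insufficient privileges": "permissions",
    "completion window": "completion window",
    "failed to complete in time": "completion window",
    "lifecycle is outside the valid range": "vault lock retention",
    "unsupported disk size": "VMware backup",
    "failed to process backup data": "VMware backup",
}


def classify_failure(job: dict) -> str:
    """Map a job's StatusMessage to one of this article's sections."""
    message = (job.get("StatusMessage") or "").lower()
    for fragment, category in KNOWN_FAILURES.items():
        if fragment in message:
            return category
    return "unrecognized"


def describe_job(backup_job_id: str) -> dict:
    """Fetch the job details, including State and StatusMessage."""
    import boto3  # imported here so classify_failure works without boto3

    return boto3.client("backup").describe_backup_job(BackupJobId=backup_job_id)
```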

Resolve "You are not authorized to perform this operation" errors

You might receive an error similar to one of the following:

  • "You are not authorized to perform this operation"
  • "Backup job failed because of insufficient privileges"
  • Any other permissions-related error

AWS Backup assumes an AWS Identity and Access Management (IAM) role to perform backups. For on-demand backups, you specify the IAM role when you start the backup job. For scheduled backups, the IAM role is configured in the resource assignments of the backup plan.

This IAM role must have a trust relationship with AWS Backup and policies that grant permissions to perform the required backup operations.

Check that you have the following requirements in place for the IAM role you're using to create backups:

  • The IAM role must list the AWS Backup service (backup.amazonaws.com) as a trusted entity. This allows AWS Backup to assume the role.
  • The IAM role must have permissions to back up the resource. Each AWS service that AWS Backup supports requires different permissions. Confirm that these permissions are granted.
  • The backup role must have AWS Key Management Service (AWS KMS) permissions. Also confirm that the AWS KMS key policy includes the account root principal, such as arn:aws:iam::111122223333:root, where 111122223333 stands in for your account ID. Without this statement in the key policy, IAM policies that allow access to the key are ineffective. IAM policies that deny access to the key remain effective.
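
The first and third checks can be sketched against the policy documents themselves. The following example assumes that you already have the role's trust policy and the key policy as parsed JSON (Python dictionaries). The helper names are hypothetical. The checks look only at the principal in Allow statements: they ignore Condition blocks, and the key policy check doesn't inspect which kms actions are allowed, so a passing result is a hint and not proof that the policies are sufficient.

```python
def _as_list(value):
    """IAM policy fields can hold a single value or a list of values."""
    return value if isinstance(value, list) else [value]


def _allow_statements(policy: dict) -> list:
    statements = _as_list(policy.get("Statement", []))
    return [s for s in statements if s.get("Effect") == "Allow"]


def trusts_aws_backup(trust_policy: dict) -> bool:
    """True if the trust policy lets backup.amazonaws.com assume the role."""
    for statement in _allow_statements(trust_policy):
        principal = statement.get("Principal", {})
        if not isinstance(principal, dict):
            continue  # for example, the wildcard string "*"
        services = _as_list(principal.get("Service", []))
        actions = _as_list(statement.get("Action", []))
        if "backup.amazonaws.com" in services and "sts:AssumeRole" in actions:
            return True
    return False


def key_policy_names_account_root(key_policy: dict, account_id: str) -> bool:
    """True if an Allow statement names the account root principal."""
    root_arn = f"arn:aws:iam::{account_id}:root"
    for statement in _allow_statements(key_policy):
        principal = statement.get("Principal", {})
        if isinstance(principal, dict) and root_arn in _as_list(principal.get("AWS", [])):
            return True
    return False
```

If you retrieve the key policy with the AWS KMS GetKeyPolicy operation, the policy is returned as a JSON string, so parse it with json.loads before you pass it to the helper.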

Note: You might be passing the default role that's created by AWS Backup. In this case, the IAM role already has permissions granted through the AWS managed policies AWSBackupServiceRolePolicyForBackup and AWSBackupServiceRolePolicyForRestores.

AWS Backup can create two IAM roles with different permissions and use cases: AWSBackupDefaultServiceRole and AWSServiceRoleForBackup. Make sure that you're using the correct IAM role. When you create a backup plan or trigger a manual backup job, choose AWSBackupDefaultServiceRole. This is the default role.

Note: You might want to create an Amazon Simple Storage Service (Amazon S3) backup using AWSBackupDefaultServiceRole. For Amazon S3, the default role doesn't contain the permissions that you need to perform backup and restore. You must perform the one-time permissions setup to grant Amazon S3 access.

Resolve "Backup Job did not complete within completion window" error

You might receive an error similar to one of the following:

  • "Backup Job did not complete within completion window"
  • "An AWS Backup job failed to complete in time"

Backup jobs that fail with one of these error messages also show an EXPIRED status. This means that the job successfully initiated, but couldn't complete within the CompleteWithin time.

Optionally, you can define the CompleteWithin time for the backup window in your backup rule configuration. The CompleteWithin parameter doesn't guarantee that a job completes successfully within the specified time. It sets the period of time in which your backup must complete. If the data transfer that backs up your resources doesn't complete during the CompleteWithin time, then AWS Backup stops the backup and the job shows the EXPIRED status.

Set a longer time period for CompleteWithin to make sure that your backup jobs complete successfully.

Note: There is no set time that it takes for an AWS Backup job to complete.
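
If you manage backup plans as JSON instead of in the console, the change could look like the following sketch. It assumes that the CompleteWithin setting corresponds to the CompletionWindowMinutes field of a rule in the backup plan document. Confirm that mapping against your own plan before you rely on it. The helper name extend_completion_window is hypothetical.

```python
import copy


def extend_completion_window(plan: dict, rule_name: str, minutes: int) -> dict:
    """Return a copy of a backup plan with a longer window on one rule."""
    updated = copy.deepcopy(plan)  # leave the caller's plan untouched
    for rule in updated["Rules"]:
        if rule["RuleName"] == rule_name:
            current = rule.get("CompletionWindowMinutes")
            if current is not None and minutes <= current:
                raise ValueError(
                    f"{minutes} is not longer than the current {current} minutes"
                )
            rule["CompletionWindowMinutes"] = minutes
            return updated
    raise KeyError(f"no rule named {rule_name!r} in plan")
```

The function only edits the document. The result still has to be submitted with an UpdateBackupPlan request, and fields that appear only in responses, such as RuleId, would likely need to be removed first.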

Resolve "Backup job failed because the lifecycle is outside the valid range for backup vault" error

This error occurs when the backup vault has a vault lock that sets MinRetentionDays or MaxRetentionDays. With this configuration, the vault rejects backups whose retention period falls outside the specified range.

To resolve this error, update the backup retention in your backup plan to fall within the range specified. Or, if applicable, update the vault lock configurations.
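
The range check can be expressed as a small function. This sketch assumes that the bounds are inclusive and that a backup with no expiry satisfies any minimum but violates any maximum. Both points are assumptions to verify against the vault lock documentation. The function name retention_allowed is hypothetical. Its arguments stand for the rule's retention in days and the vault's MinRetentionDays and MaxRetentionDays values.

```python
from typing import Optional


def retention_allowed(
    delete_after_days: Optional[int],
    min_days: Optional[int] = None,
    max_days: Optional[int] = None,
) -> bool:
    """Check a backup's retention against a vault lock's range.

    None for delete_after_days means the backup never expires.
    None for a bound means the vault lock doesn't set that bound.
    """
    if delete_after_days is None:
        # Assumption: indefinite retention only conflicts with a maximum.
        return max_days is None
    if min_days is not None and delete_after_days < min_days:
        return False
    if max_days is not None and delete_after_days > max_days:
        return False
    return True
```

For example, with a vault lock of 7 to 90 days, a retention of 30 days passes and a retention of 120 days doesn't.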

Resolve "Unsupported disk size detected during backup creation" errors

When performing a VMware backup, you might see one of the following errors:

  • "Unsupported disk size detected during backup creation. Aborted backup job"
  • "Failed to process backup data during backup data processing. Aborted backup job"

These errors can occur for the following reasons:

  • Unsupported versions - AWS Backup supports backup and restore for VMware with specific versions and requirements. Make sure that your virtual machine (VM) is supported.
  • Backup gateway network connectivity - Make sure that all of the required ports are open for the gateway to connect to the host and back up the virtual machines. Then, confirm that there are no incorrect DNS servers configured on the gateway appliance.
  • Disk configuration - AWS Backup doesn't support the backup of VMs that have disks in independent persistent or independent non-persistent mode. List all of the disks attached to your VMs along with the disk mode. Check for any disks in either of these modes. Change the disk mode of all disks to dependent.
  • Disk size - AWS Backup supports only VM virtual disk sizes that are a multiple of 512 KiB. Run fdisk -l from the VM that's failing the backup job for further insight.
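
The disk configuration and disk size checks above can be sketched as follows, assuming that you have already collected each disk's name, mode, and size in bytes from your VMware inventory. The dictionary keys name, mode, and size_bytes and the function name disk_problems are hypothetical. The mode strings follow the vSphere naming convention, which is also an assumption here.

```python
SUPPORTED_SIZE_MULTIPLE = 512 * 1024  # 512 KiB, as stated in this article

# Assumption: modes are reported with vSphere-style identifiers.
UNSUPPORTED_MODES = {"independent_persistent", "independent_nonpersistent"}


def disk_problems(disks: list) -> list:
    """Return one message per disk setting that could fail the backup."""
    problems = []
    for disk in disks:
        if disk["mode"] in UNSUPPORTED_MODES:
            problems.append(f"{disk['name']}: unsupported mode {disk['mode']}")
        if disk["size_bytes"] % SUPPORTED_SIZE_MULTIPLE != 0:
            problems.append(
                f"{disk['name']}: size {disk['size_bytes']} bytes "
                "is not a multiple of 512 KiB"
            )
    return problems
```

An empty list means that neither of these two checks found an issue. It doesn't rule out the version or network causes listed above.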

AWS OFFICIAL - Updated 10 months ago