How do I configure AWS Backup for large Amazon S3 buckets with millions or billions of objects?

I want to use AWS Backup to back up large Amazon Simple Storage Service (Amazon S3) buckets.

Short description

This article primarily serves as a best-practice guide to onboard very large S3 buckets into AWS Backup.

When you use AWS Backup to protect large S3 buckets containing millions or billions of objects, initial backup jobs might fail to complete within the default completion window if not configured properly. This can result in EXPIRED or PARTIAL recovery points, especially when backup plans contain multiple rules.

To configure AWS Backup for large S3 buckets, complete the following steps:

  1. Identify and remove unnecessary backup rules in your backup plan.
  2. Configure a continuous backup rule with appropriate frequency and retention settings.
  3. Adjust your backup completion window parameter to allow sufficient completion time.
  4. Monitor backup job progress and verify recovery point status.

Resolution

Note: If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, then see Troubleshooting errors for the AWS CLI. Also, make sure that you're using the most recent AWS CLI version.

Identify and remove unnecessary backup rules in your backup plan

For the initial backup of a large S3 bucket, backup plans that contain multiple rules with similar or overlapping cron expressions (for example, one rule for continuous backups and one for snapshot-based backups) can result in EXPIRED and PARTIAL recovery points, especially if the completion window is too short.

To check the rules, complete the following steps:

  1. Open the AWS Backup console.
  2. In the navigation pane, choose Backup plans.
  3. Select your backup plan and review the Backup rules section.
  4. Check whether multiple rules share the same cron schedule expression and, more importantly, review the completion window of each rule.

If you find conflicting rules, remove the snapshot-based rule and keep only the continuous backup rule. Unless you must retain backups for more than 35 days, a continuous rule alone is sufficient. Continuous backups support a retention period of 1–35 days and allow you to restore to any point in time within that retention range at second-level granularity. You can introduce a snapshot-based rule later, after the initial continuous recovery point is created. The goal is to successfully create the first continuous recovery point. Later snapshot-based recovery points are then based on this existing continuous recovery point, which makes the backup process smoother.

Note: If your retention requirement exceeds 35 days, you can introduce a separate snapshot-based rule after the initial continuous recovery point has successfully completed.
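The console check above can also be sketched in code. The following Python example is a minimal illustration, not an official tool: the helper name find_conflicting_rules and the sample rule values are hypothetical. The rule dictionaries are assumed to have the same shape as the "Rules" list that the boto3 backup client returns from get_backup_plan.

```python
from collections import defaultdict


def find_conflicting_rules(rules):
    """Return schedule expressions that are used by two or more rules."""
    by_schedule = defaultdict(list)
    for rule in rules:
        by_schedule[rule.get("ScheduleExpression")].append(rule["RuleName"])
    return {s: names for s, names in by_schedule.items() if len(names) > 1}


# Hypothetical sample data. With credentials configured, a real list could
# come from:
#   boto3.client("backup").get_backup_plan(BackupPlanId=...)["BackupPlan"]["Rules"]
rules = [
    {"RuleName": "continuous-daily", "ScheduleExpression": "cron(0 5 ? * * *)",
     "CompletionWindowMinutes": 43200, "EnableContinuousBackup": True},
    {"RuleName": "snapshot-daily", "ScheduleExpression": "cron(0 5 ? * * *)",
     "CompletionWindowMinutes": 480, "EnableContinuousBackup": False},
]

conflicts = find_conflicting_rules(rules)
for schedule, names in conflicts.items():
    print(f"{schedule} is shared by: {', '.join(names)}")

# Also list each rule's completion window so that short windows stand out.
for rule in rules:
    print(rule["RuleName"], "completion window (minutes):",
          rule.get("CompletionWindowMinutes"))
```

This sketch only flags identical schedule expressions. Schedules that overlap without being identical still need a manual review.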

Configure a continuous backup rule with appropriate frequency and retention settings

For large S3 buckets, use a single continuous backup rule. The frequency setting on a continuous rule doesn't affect how often the S3 bucket is backed up—continuous backups always run at 5-minute intervals. Instead, the frequency resets the retention clock on the continuous recovery point. This keeps the recovery point from expiring before it's completed.

Setting the frequency:

Set the frequency so that it passes at least twice within your retention window. This prevents the continuous recovery point from being lifecycled (expired) before the next job completes. An expired recovery point causes a job failure and forces a full re-sync of the S3 bucket, which can increase costs.

Example 1: If you set a continuous rule with a daily frequency and 7-day retention, the frequency passes at least 6 times before the retention window elapses.

Example 2: If you set a continuous rule with a weekly frequency and 15-day retention, the frequency passes at least twice during the 15-day period (every 7 days).

To determine your frequency, start with your retention requirement, and then set the frequency to pass at least twice within that retention period.
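The rule of thumb above is simple arithmetic. The following Python sketch shows one way to check it. The function names are hypothetical, and counting whole intervals with integer division is an assumption about how to interpret "passes at least twice".

```python
def frequency_passes(retention_days, frequency_days):
    """Count the whole frequency intervals that fit in the retention window."""
    if retention_days <= 0 or frequency_days <= 0:
        raise ValueError("retention and frequency must be positive")
    return retention_days // frequency_days


def is_frequency_safe(retention_days, frequency_days):
    """True when the frequency passes at least twice within the retention."""
    return frequency_passes(retention_days, frequency_days) >= 2


print(frequency_passes(7, 1))    # Example 1: daily frequency, 7-day retention
print(frequency_passes(15, 7))   # Example 2: weekly frequency, 15-day retention
print(is_frequency_safe(10, 7))  # weekly frequency, 10-day retention
```

Under this interpretation, a weekly frequency with a 10-day retention period fits only one full interval, so the check returns False for that combination.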

To configure the rule, complete the following steps:

  1. Open the AWS Backup console.
  2. In the navigation pane, choose Backup plans, and then select your backup plan.
  3. Choose Add backup rule.
  4. Set Backup frequency to Daily, Weekly, or another interval, based on your retention requirement.
  5. Select the Enable continuous backups for supported resources checkbox.
  6. Set the Complete within window to 30 days.
  7. Set the Retention period to your desired value (1–35 days).
  8. Choose Save rule.

Important: Don't change the retention setting on the rule after the initial continuous recovery point is created. Changing retention forces a full re-sync of the bucket, even if the initial continuous recovery point already completed successfully. This is why it's important to know what your retention requirement is before beginning this process.
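If you manage backup plans in code instead of the console, the same settings can be expressed as a rule document. The following Python sketch builds a rule dictionary in the shape that the boto3 backup client accepts. The rule name, vault name, plan name, and cron expression are hypothetical placeholders, and the example doesn't call AWS.

```python
def build_continuous_rule(vault_name, retention_days,
                          schedule="cron(0 5 ? * * *)"):
    """Build a single continuous backup rule for a large S3 bucket."""
    if not 1 <= retention_days <= 35:
        raise ValueError("continuous backups support 1-35 days of retention")
    return {
        "RuleName": "s3-continuous",                # hypothetical name
        "TargetBackupVaultName": vault_name,
        "ScheduleExpression": schedule,             # assumed: daily at 05:00 UTC
        "StartWindowMinutes": 60,                   # Start within: 1 hour
        "CompletionWindowMinutes": 30 * 24 * 60,    # Complete within: 30 days
        "Lifecycle": {"DeleteAfterDays": retention_days},
        "EnableContinuousBackup": True,
    }


rule = build_continuous_rule("my-backup-vault", retention_days=7)
plan = {"BackupPlanName": "large-s3-plan", "Rules": [rule]}
print(plan)

# With credentials configured, the plan document could be submitted with
# boto3, for example:
#   import boto3
#   boto3.client("backup").create_backup_plan(BackupPlan=plan)
```

The retention check runs before the rule is built because changing retention later forces a full re-sync of the bucket.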

Adjust your backup completion window parameter to allow sufficient completion time

Default backup completion window settings might not provide enough time for initial backups of large S3 buckets. Adjust the backup completion window parameter to allow sufficient completion time:

  • Start within: Set to 1 hour.
  • Complete within: Set to 30 days for very large buckets (millions/billions of objects or petabyte-scale data).

For reference, a bucket with approximately 5 billion objects and 6 PB (petabytes) of data can take roughly 100 hours (about 4 days) for the initial backup. This estimate isn't guaranteed, and there is no SLA on job completion times. For more information on estimated backup times, see Amazon S3 backups. Setting a large completion window doesn't cost anything and allows enough time for the initial job to complete. After the initial job is complete, you can readjust the window as needed.
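The AWS Backup API expresses both windows in minutes, so it can help to convert the values above and compare them with the 100-hour reference figure. The following Python sketch is plain arithmetic. The variable names are illustrative, and the headroom ratio is only a rough sanity check, not a prediction of job duration.

```python
MINUTES_PER_HOUR = 60
MINUTES_PER_DAY = 24 * MINUTES_PER_HOUR

start_window_minutes = 1 * MINUTES_PER_HOUR        # Start within: 1 hour
completion_window_minutes = 30 * MINUTES_PER_DAY   # Complete within: 30 days

# Reference point from the text: about 100 hours for 5 billion objects, 6 PB.
reference_job_minutes = 100 * MINUTES_PER_HOUR

# How many times the reference job duration fits in the completion window.
headroom = completion_window_minutes / reference_job_minutes

print(start_window_minutes)       # 60
print(completion_window_minutes)  # 43200
print(round(headroom, 1))         # 7.2
```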

Monitor backup job progress and verify recovery point status

Complete the following steps:

  1. Open the AWS Backup console.
  2. In the navigation pane, under My account, choose Jobs.
  3. Check the status of your backup job. You see one of the following statuses:
    • A Completed status indicates that the backup finished successfully.
    • A Running status with a percentage indicates the backup is still in progress. The first continuous recovery point requires a full bucket scan and takes longer than subsequent backups.
    • A Partial status indicates that a previous job expired before completion, typically caused by conflicting rules or an insufficient completion window.
    • An Expired status means the job didn't finish within the completion window.

Note: After the initial continuous recovery point completes, subsequent backups are significantly faster because they only listen for changes rather than scanning the entire bucket.
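The status checks above can be sketched as a small helper for scripted monitoring. In the following Python example, the function name describe_job and the sample jobs are hypothetical. The job dictionaries are assumed to have the same shape as the entries that the boto3 backup client returns from list_backup_jobs.

```python
def describe_job(job):
    """Translate a backup job's state into the guidance from the list above."""
    state = job["State"]
    if state == "COMPLETED":
        return "backup finished successfully"
    if state == "RUNNING":
        return f"in progress ({job.get('PercentDone', '?')}% done)"
    if state == "PARTIAL":
        return "previous job expired; check for conflicting rules or a short window"
    if state == "EXPIRED":
        return "job did not finish within the completion window"
    return f"other state: {state}"


# Hypothetical sample data. With credentials configured, real jobs could
# come from:
#   boto3.client("backup").list_backup_jobs(ByResourceType="S3")["BackupJobs"]
jobs = [
    {"BackupJobId": "job-1", "State": "RUNNING", "PercentDone": "42.0"},
    {"BackupJobId": "job-2", "State": "EXPIRED"},
]

for job in jobs:
    print(job["BackupJobId"], "->", describe_job(job))
```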

Important: Make sure that any on-demand or snapshot-based backups for the S3 bucket are stored in the same backup vault that your backup plan targets. This ensures that subsequent backups are incremental and recognize the continuous recovery point.

Additional cost considerations

  1. Larger buckets that don't change frequently can benefit from continuous backups. Costs can be lower because pre-existing objects (objects that are unchanged from the previous backup) don't require a scan of the whole bucket or multiple requests per object.

  2. Buckets that contain more than 100 million objects and that have a small delete rate compared to the overall backup size might realize cost benefits with a backup plan that contains both a continuous backup with a retention period of 2 days and snapshots with a longer retention period.

Note: For more cost optimization techniques, see Best practices and cost considerations for S3 backups.


Related information
