On-demand S3 Batch Operations using the S3 manifest generator

5 minute read
Content level: Advanced

Use S3 Batch Operations with the manifest generator to perform on-demand, targeted operations across your S3 objects without waiting for inventory reports or manually creating manifest files.

Amazon S3 Batch Operations can process up to 20 billion objects with a single request, but traditional approaches often require waiting up to 48 hours for the first inventory report before you can take action. The S3 Batch Operations manifest generator feature changes this by enabling immediate, targeted operations across S3 objects through dynamic filtering, eliminating the wait time and streamlining essential tasks like disaster recovery, compliance management, and data lifecycle operations.

With the Amazon S3 manifest generator feature, you no longer need to wait for inventory reports. The capability is available through both the AWS CLI/SDKs and the S3 console, allowing you to create and run batch jobs immediately with dynamic filtering.

Using the S3 console for on-demand S3 Batch Operations

The S3 console provides an intuitive interface for creating batch jobs with the manifest generator. Here's the detailed process:

Step 1: Job Setup and Scope Definition

  1. Navigate to the S3 service in the AWS Management Console
  2. Select Batch Operations from the left navigation pane
  3. Click Create job and choose your desired AWS Region
  4. Under Manifest, select "Generate an object list using filters" instead of using a pre-existing inventory report
  5. Specify your source bucket (e.g., s3://your-source-bucket)
  6. Apply object filters based on your criteria:
    • Prefix filters: Target specific directories (e.g., "2024/jan-24/")
    • Storage class filters: Focus on specific storage tiers
    • Size filters: Process objects within certain size ranges
    • Date filters: Target objects created within specific timeframes
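
For reference, these console filters map to the Filter block of the S3JobManifestGenerator in the CreateJob API (s3control). A minimal sketch of that block, with a placeholder bucket name and example filter values, might look like this:

"ManifestGenerator": {
  "S3JobManifestGenerator": {
    "SourceBucket": "arn:aws:s3:::your-source-bucket",
    "EnableManifestOutput": false,
    "Filter": {
      "KeyNameConstraint": { "MatchAnyPrefix": ["2024/jan-24/"] },
      "MatchAnyStorageClass": ["STANDARD"],
      "ObjectSizeGreaterThanBytes": 1024,
      "ObjectSizeLessThanBytes": 1073741824,
      "CreatedAfter": "2024-01-01T00:00:00Z",
      "CreatedBefore": "2024-02-01T00:00:00Z"
    }
  }
}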

Step 2: Operation Configuration

  1. Select your desired operation (Copy, Restore, Tag, Delete, etc.)
  2. Configure operation-specific settings:
    • For Copy operations: Set destination bucket, storage class, and encryption options
    • For Restore operations: Define expiration days and retrieval tier
    • For Tagging operations: Specify tag keys and values
    • For Object lock retention: Choose retention mode and days to retain
  3. Choose between "Use API default settings" for standard operations or "Specify settings" for custom configurations
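
In the CreateJob API, the operation you configure in the console corresponds to a single Operation block. As an example, a restore operation sketch (the retention and retrieval tier values below are illustrative only):

"Operation": {
  "S3InitiateRestoreObject": {
    "ExpirationInDays": 7,
    "GlacierJobTier": "BULK"
  }
}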

Step 3: Job Management Settings

  1. Provide a descriptive job name and description for tracking
  2. Set job priority (higher numbers indicate higher priority)
  3. Choose execution preference:
    • "Run job automatically" for immediate execution after creation
    • "Review before running" to verify configuration before execution
  4. Configure manifest output location for audit purposes
  5. Set up completion reporting (failed tasks only or all tasks)
  6. Select an appropriate IAM role with necessary permissions
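
These job management settings correspond roughly to the Priority, ConfirmationRequired, Report, and RoleArn fields of the CreateJob request. A sketch with a placeholder account ID, report bucket, and role name:

"Priority": 10,
"ConfirmationRequired": true,
"Report": {
  "Bucket": "arn:aws:s3:::your-report-bucket",
  "Format": "Report_CSV_20180820",
  "Enabled": true,
  "Prefix": "batch-op-reports",
  "ReportScope": "FailedTasksOnly"
},
"RoleArn": "arn:aws:iam::111122223333:role/your-batch-operations-role"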

Step 4: Review and Execute

  1. Review all configuration details in the summary screen
  2. Click "Create job" to initialize the batch operation
  3. If you selected "Review before running," manually trigger execution by selecting "Run job"
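
If you prefer the command line for this confirmation step, a job created with "Review before running" can typically be started with update-job-status (the account ID and job ID below are placeholders):

aws s3control update-job-status \
  --account-id 111122223333 \
  --job-id YOUR_JOB_ID \
  --requested-job-status Ready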

This approach transforms what was previously a 24-48 hour wait (for the first inventory report) into an immediate operational response, making S3 Batch Operations suitable for time-critical scenarios while maintaining the same enterprise-scale processing capabilities.

Using the CLI for on-demand S3 Batch Operations

When creating S3 Batch Operations jobs with the manifest generator through the AWS CLI, it's best to use a JSON input file instead of typing out numerous parameters directly. Here's how to do it:

  1. First, generate a template JSON file using:
aws s3control create-job --generate-cli-skeleton
  2. The resulting JSON file contains all possible configuration fields. You'll need to:
  • Keep only the operation type you plan to use
  • Choose between using either a manifest file OR manifest generator (not both)
  • Fill in required fields and any optional ones you need
  • Remove unused fields
  3. Once your JSON file is ready, create the batch job using:
aws s3control create-job --cli-input-json file://FILE_NAME
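
As an illustration, a trimmed input file for an object tagging job that uses the manifest generator with a prefix filter might look like the following (the account ID, bucket names, role ARN, and tag values are placeholders):

{
  "AccountId": "111122223333",
  "ConfirmationRequired": true,
  "Description": "Tag January 2024 objects",
  "Priority": 10,
  "RoleArn": "arn:aws:iam::111122223333:role/your-batch-operations-role",
  "Operation": {
    "S3PutObjectTagging": {
      "TagSet": [
        { "Key": "project", "Value": "archive-2024" }
      ]
    }
  },
  "Report": {
    "Bucket": "arn:aws:s3:::your-report-bucket",
    "Format": "Report_CSV_20180820",
    "Enabled": true,
    "Prefix": "batch-op-reports",
    "ReportScope": "AllTasks"
  },
  "ManifestGenerator": {
    "S3JobManifestGenerator": {
      "SourceBucket": "arn:aws:s3:::your-source-bucket",
      "EnableManifestOutput": false,
      "Filter": {
        "KeyNameConstraint": { "MatchAnyPrefix": ["2024/jan-24/"] }
      }
    }
  }
}

Save the file as, for example, job.json and pass it to the create-job command shown above (aws s3control create-job --cli-input-json file://job.json).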

Key Advantages of the Manifest Generator

  • Immediate execution: No waiting for inventory report generation
  • Dynamic filtering: Real-time object selection based on current bucket state
  • Precise targeting: Multiple filter criteria can be combined for exact object selection

Use Case Example

For a disaster recovery scenario where you need to restore database backups stored in Glacier Deep Archive:

  1. Use storage class filter to target only "GLACIER" or "DEEP_ARCHIVE" objects
  2. Apply prefix filter to focus on backup directories (e.g., "database-backups/2024/")
  3. Set restore operation with appropriate retrieval tier and expiration
  4. Execute immediately without waiting for inventory reports
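
A sketch of the manifest generator filter and restore operation for this scenario (the bucket name, prefix, and retention values are examples only):

"ManifestGenerator": {
  "S3JobManifestGenerator": {
    "SourceBucket": "arn:aws:s3:::your-backup-bucket",
    "EnableManifestOutput": false,
    "Filter": {
      "MatchAnyStorageClass": ["GLACIER", "DEEP_ARCHIVE"],
      "KeyNameConstraint": { "MatchAnyPrefix": ["database-backups/2024/"] }
    }
  }
},
"Operation": {
  "S3InitiateRestoreObject": {
    "ExpirationInDays": 10,
    "GlacierJobTier": "STANDARD"
  }
}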

Important considerations:

  • Object versions: The S3 Batch Operations manifest generator selects only the latest version of each object. The exception is Batch Replication, which also picks up older object versions that meet the filter criteria.
  • Object count limits: By default, S3 Batch Operations jobs can process up to 4 billion objects for all operations. Copy, Object Tagging, Object Lock, AWS Lambda invocation, and Batch Replication jobs can support up to 20 billion objects.
  • Manifest generation: Amazon S3 Batch Operations does not support cross-Region manifest generation. Refer to the Batch Operations job documentation for details.
