Data copy using S3 Batch Operations


I have a requirement to transfer a large volume of S3 data (10 TB) one time from one AWS account to another, plus 1 TB of data on demand (once or twice a month). This document (https://docs.aws.amazon.com/AmazonS3/latest/userguide/specify-batchjob-manifest-xaccount-inventory.html) describes copying objects across AWS accounts, but there the recipient pulls the data from the source; in my case the data must be pushed from source to target. We can't grant the external vendors we need to send data to any access to our bucket/prefixes.

I've set up the permissions in exactly the same manner, but with the source and destination permissions swapped. I tried to copy the objects from source to destination (using S3 Inventory) by setting up a Batch Operations job at the source, but it didn't work. The batch job fails with an "ACL not supported" error. Any help/clue would be greatly appreciated.

Sibu
asked 5 days ago · 31 views
2 Answers
Accepted Answer

First of all, thank you for posting your question here.

You are right: this document suggests creating the S3 batch copy job in the destination account. I understand your use case, where the source account may not want to allow any external account access to its bucket, and there can be good reasons for that. I've been in this exact situation, so I am sharing the detailed steps to achieve this:

For batch copy to work in the source-to-destination direction, the destination bucket must have ACLs enabled; otherwise, the batch operation will try to put an ACL on each object while copying to the destination bucket and will fail with the following error:

"The bucket does not allow ACLs (Service: Amazon S3; Status Code: 400; Error Code: AccessControlListNotSupported)"

Refer to the AWS documentation on S3 object ACLs for background.

Pre-requisites:

  • Destination bucket has ACLs enabled (Bucket owner preferred or Object writer)
  • Both buckets are in the same Region
  • You know the destination account's canonical ID
  • You have an S3 bucket for storing manifest/inventory files and batch job completion reports
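For the recurring on-demand transfers, a full S3 Inventory is not required: S3 Batch Operations also accepts a CSV manifest listing only the objects to copy, one `bucket,key` pair per line with no header row and URL-encoded keys. A minimal sketch of generating such a manifest (the bucket name and keys below are hypothetical):

```python
import csv
from urllib.parse import quote

# Hypothetical source bucket and the specific keys to transfer this month.
SOURCE_BUCKET = "my-source-bucket"
keys = [
    "prefix-a/report 2024.parquet",
    "prefix-b/data.csv",
]

# S3 Batch Operations CSV manifests have no header row:
# each line is "<bucket>,<url-encoded-key>".
with open("manifest.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for key in keys:
        writer.writerow([SOURCE_BUCKET, quote(key)])
```

Upload the resulting `manifest.csv` to the manifest bucket and point the batch job at it (choosing the CSV manifest format instead of the inventory report).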

Assumption:

  • Both buckets are encrypted using an SSE-KMS customer managed key (CMK)
  • No object is larger than 5 GB
  • The individual who will create the batch job is logged in using an IAM role. This is not a requirement, just an assumption for the permissions setup below; replace it with your own AWS principal

Permissions setup:

Source Side:

  1. Manifest bucket policy to be updated:

     {
         "Sid": "S3BatchCopyInventory",
         "Effect": "Allow",
         "Principal": {
             "Service": "s3.amazonaws.com"
         },
         "Action": "s3:PutObject",
         "Resource": "arn:aws:s3:::<inventory_bucket_name>/*",
         "Condition": {
             "StringEquals": {
                 "aws:SourceAccount": "<source_account_number>",
                 "s3:x-amz-acl": "bucket-owner-full-control"
             },
             "ArnLike": {
                 "aws:SourceArn": "arn:aws:s3:::<source_bucket_name>"
             }
         }
     }
    
  2. Batch Operations role to be created in the source account:

    2.1. Permissions:

     {   
         "Version": "2012-10-17",
         "Statement": [
             {
                 "Action": [
                     "s3:PutObject",
                     "s3:PutObjectAcl",
                     "s3:PutObjectVersionAcl",
                     "s3:PutObjectVersionTagging",
                     "s3:PutObjectTagging"
                 ],
                 "Resource": [
                     "arn:aws:s3:::<target_bucket_name>/*"
                 ],
                 "Effect": "Allow",
                 "Sid": "S3BatchTarget"
             },
             {
                 "Action": [
                     "s3:GetObject",
                     "s3:GetObjectAcl",
                     "s3:GetObjectVersionAcl",
                     "s3:GetObjectVersion",
                     "s3:GetObjectVersionTagging",
                     "s3:GetObjectTagging",
                     "s3:GetBucketLocation"
                 ],
                 "Resource": [
                     "arn:aws:s3:::<source_bucket_name>",
                     "arn:aws:s3:::<source_bucket_name>/<optional_prefix>/*"
                 ],
                 "Effect": "Allow",
                 "Sid": "S3BatchSource"
             },
             {
                 "Action": [
                     "s3:GetObject",
                     "s3:PutObject",
                     "s3:GetObjectVersion",
                     "s3:GetBucketLocation"
                 ],
                 "Resource": [
                     "arn:aws:s3:::<inventory_bucket_name>",
                     "arn:aws:s3:::<inventory_bucket_name>/*"
                 ],
                 "Effect": "Allow",
                 "Sid": "S3BatchManifestReport"
             },
             {
                 "Action": [
                     "kms:Encrypt",
                     "kms:Decrypt",
                     "kms:ReEncrypt*",
                     "kms:GenerateDataKey*",
                     "kms:DescribeKey"
                 ],
                 "Resource": "<destination_s3_bucket_kms_key_arn>",
                 "Effect": "Allow",
                 "Sid": "AllowUseOfExternalS3KMSKey"
             },
             {
                 "Action": [
                     "kms:Encrypt",
                     "kms:Decrypt",
                     "kms:ReEncrypt*",
                     "kms:GenerateDataKey*",
                     "kms:DescribeKey"
                 ],
                 "Resource": "<source_s3_bucket_kms_key_arn>",
                 "Effect": "Allow",
                 "Sid": "AllowUseOfLocalS3KMSKey"
             }
         ]
     }
    

    2.2. Batch Operations Role Trust Policy:

     {
        "Version":"2012-10-17",
        "Statement":[
           {
              "Effect":"Allow",
              "Principal":{
                 "Service":"batchoperations.s3.amazonaws.com"
              },
              "Action":"sts:AssumeRole"
           } 
        ]
     }
    

Target Side:

  1. Target bucket policy to be updated:

     {
         "Version": "2012-10-17",
         "Id": "S3-Batch-Copy-Policy",
         "Statement": [
             {
                 "Sid": "DenyAllUnlessApproved",
                 "Effect": "Deny",
                 "Principal": "*",
                 "Action": [
                     "s3:GetObject",
                     "s3:PutObject"
                 ],
                 "Resource": [
                     "arn:aws:s3:::<target_bucket_name>",
                     "arn:aws:s3:::<target_bucket_name>/<optional_prefix>/*"
                 ],
                 "Condition": {
                     "StringNotLike": {
                         "aws:PrincipalArn": [
                             "arn:aws:iam::<source_account_number>:role/<source_ac_s3_batch_copy_role>",
                             "arn:aws:iam::<source_account_number>:role/<source_acnt_iam_role>"
                         ]
                     }
                 }
             },
             {
                 "Sid": "AllowBatchCopyRole",
                 "Effect": "Allow",
                 "Principal": {
                     "AWS": "arn:aws:iam::<source_account_number>:role/<source_ac_s3_batch_copy_role>"
                 },
                 "Action": [
                     "s3:PutObject",
                     "s3:PutObjectAcl",
                     "s3:PutObjectTagging"
                 ],
                 "Resource": "arn:aws:s3:::<target_bucket_name>/<optional_prefix>/*"
             },
             {
                 "Sid": "AllowConsoleBatchJobCreation",
                 "Effect": "Allow",
                 "Principal": {
                     "AWS": [
                         "arn:aws:iam::<source_account_number>:role/<source_acnt_iam_role>"
                     ]
                 },
                 "Action": [
                     "s3:Get*",
                     "s3:List*"
                 ],
                 "Resource": [
                     "arn:aws:s3:::<target_bucket_name>",
                     "arn:aws:s3:::<target_bucket_name>/<optional_prefix>/*"
                 ]
             }
         ]
     }
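Since a bucket can have only one policy document, "to be updated" means appending these statements to whatever policy already exists on the target bucket, not replacing it. A rough sketch of that merge, using a made-up existing policy (the answer's placeholders are kept verbatim):

```python
import json

# Hypothetical policy already attached to the target bucket.
existing_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "ExistingStatement", "Effect": "Deny",
         "Principal": "*", "Action": "s3:*",
         "Resource": "arn:aws:s3:::my-target-bucket/*",
         "Condition": {"Bool": {"aws:SecureTransport": "false"}}},
    ],
}

# One of the statements from the policy above (placeholders as-is).
batch_copy_statement = {
    "Sid": "AllowBatchCopyRole",
    "Effect": "Allow",
    "Principal": {"AWS": "arn:aws:iam::<source_account_number>:role/<source_ac_s3_batch_copy_role>"},
    "Action": ["s3:PutObject", "s3:PutObjectAcl", "s3:PutObjectTagging"],
    "Resource": "arn:aws:s3:::<target_bucket_name>/<optional_prefix>/*",
}

# Append the new statement and serialize the merged document.
existing_policy["Statement"].append(batch_copy_statement)
merged = json.dumps(existing_policy, indent=4)
```

The `merged` document is what you would then apply to the bucket (for example via the console policy editor or `put-bucket-policy`).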
    
  2. Target bucket KMS key policy to be updated:

    {
         "Sid": "Cross account KMS Key access",
         "Effect": "Allow",
         "Principal": {
             "AWS": "arn:aws:iam::<source_account_number>:role/<source_ac_s3_batch_copy_role>"
         },
         "Action": [
             "kms:Encrypt",
             "kms:Decrypt",
             "kms:ReEncrypt*",
             "kms:GenerateDataKey*",
             "kms:DescribeKey"
         ],
         "Resource": "*"
     }
    

Create Batch copy job:

  1. Log in to the source account, go to the S3 console, and choose Batch Operations -> Create job
  2. Choose the S3 inventory report option (if you are using an inventory file)
  3. Provide the S3 location of the manifest object
  4. Click Next
  5. Choose the Copy operation and provide the destination bucket location
  6. Select other options as applicable
  7. Under Access control list, click Add grantee, enter the canonical ID you received from the destination account, and tick the boxes for Objects - Read and Object ACL - Read and Write
  8. Click Next and provide the completion report location; this is where the S3 batch job will place its results
  9. Provide the ARN of the IAM role you created above in the Permissions setup
  10. Click Next, review the details, and click Create job
  11. Refresh the page; once the status says "Awaiting your confirmation", run the job and wait for the results to be stored in the completion report location
  12. Review the completion report
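The console steps above can also be scripted through the `create_job` API of the S3 Control service. A sketch of the request shape only (all ARNs, IDs, and paths below are placeholders; this builds the request locally and does not call AWS):

```python
# Request parameters for s3control create_job (boto3) / `aws s3control create-job`.
# All ARNs, account numbers, and object paths are placeholders.
CANONICAL_ID = "<destination_account_canonical_id>"

create_job_request = {
    "AccountId": "<source_account_number>",
    "ConfirmationRequired": True,  # job waits in "Awaiting your confirmation"
    "RoleArn": "arn:aws:iam::<source_account_number>:role/<source_ac_s3_batch_copy_role>",
    "Priority": 10,
    "Operation": {
        "S3PutObjectCopy": {
            "TargetResource": "arn:aws:s3:::<target_bucket_name>",
            # Matches step 7: Objects - Read, Object ACL - Read and Write.
            "AccessControlGrants": [
                {"Grantee": {"TypeIdentifier": "id", "Identifier": CANONICAL_ID},
                 "Permission": p}
                for p in ("READ", "READ_ACP", "WRITE_ACP")
            ],
        }
    },
    "Manifest": {
        "Spec": {"Format": "S3InventoryReport_CSV_20161130"},
        "Location": {
            "ObjectArn": "arn:aws:s3:::<inventory_bucket_name>/path/to/manifest.json",
            "ETag": "<manifest_object_etag>",
        },
    },
    "Report": {
        "Bucket": "arn:aws:s3:::<inventory_bucket_name>",
        "Enabled": True,
        "Format": "Report_CSV_20180820",
        "Prefix": "batch-reports",
        "ReportScope": "AllTasks",
    },
}
```

With real values filled in, this dict can be passed as keyword arguments to `boto3.client("s3control").create_job(**create_job_request)`.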

Note: Once the data is copied to the destination bucket, ACLs on the destination bucket can be disabled again and the copied objects will still be accessible. But as mentioned at the beginning, for S3 batch copy to work from source to destination, the destination bucket must have ACLs enabled while the job runs.

Comment here if you have additional questions.

Happy to help.

Abhishek

AWS EXPERT
answered 5 days ago
reviewed 4 days ago
  • Thank you so much for this detailed answer, this worked.


Hi,

Did you consider using AWS DataSync for those cross-account transfers? It was built exactly for this kind of need: see https://docs.aws.amazon.com/datasync/latest/userguide/tutorial_s3-s3-cross-account-transfer.html


Best,

Didier

AWS EXPERT
answered 5 days ago
reviewed 4 days ago
  • Thank you for your help, I'll definitely look at this option as well. The reason we wanted to go with S3 Batch is that we have a use case requiring frequent transfer of a few specific S3 objects from various prefixes within a bucket, so we planned to create a manifest file containing only the objects that need to be transferred. With DataSync, we would have to copy those objects to the DataSync source location first, and copying ~1 TB of data every time would be an overhead.
