
(How) can I use DataSync to _pull_ data from one region to another in S3?

0

I wish to use DataSync to replicate data from a partner's bucket. The partner will grant access to our IAM role, so the transfer crosses both regions and accounts.

https://repost.aws/questions/QUjCk76rHXTiiD24kbYRYbXQ/copy-data-cross-account-cross-region-using-datasync says that the config should live in the destination account and region, but the linked tutorial says the opposite.

Is this, in fact, a hard requirement?

Based on an automatically generated answer, I tried to configure all of this in the destination account using the TypeScript CDK. 'cdk deploy' failed with this error:

11:19:50 AM | CREATE_FAILED | AWS::DataSync::LocationS3 | ThirdbridgeMirror/...ce-bucket-location Resource handler returned message: "Invalid request provided: DataSync location access test failed: could not perform s3:ListObjectsV2 on bucket proda-forum-data-delivery-port-analytics. Error: The authorization header is malformed; the region 'us-east-1' is wrong; expecting 'eu-west-1'. (Service: DataSync, Status Code: 400, Request ID: 1a22926b-49f5-4c65-91cd-56c1dd9e0ee5) (SDK Attempt Count: 1)" (RequestToken: d79639be-23b1-20c5-f41d-11dc4ef3f035, HandlerErrorCode: InvalidRequest)

asked 8 months ago · 190 views
2 Answers
0
Accepted Answer

A DataSync location object for an S3 bucket must be defined in the same region as the S3 bucket.
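The error in the question reports exactly this mismatch: the request was signed for 'us-east-1' (the region where the location was being created) while the bucket lives in 'eu-west-1'. As a rough illustration, the two regions can be pulled out of such a message like this. The function and field names are made up for this sketch, and the pattern only matches the wording seen in the question.

```typescript
// Hypothetical helper: extracts the two regions from an S3
// "authorization header is malformed" message like the one in the question.
interface RegionMismatch {
  signedRegion: string;   // region the request was signed for
  expectedRegion: string; // region the bucket actually lives in
}

function parseRegionMismatch(message: string): RegionMismatch | null {
  const pattern = /the region '([a-z0-9-]+)' is wrong; expecting '([a-z0-9-]+)'/;
  const match = pattern.exec(message);
  if (match === null) {
    return null;
  }
  const [, signedRegion = "", expectedRegion = ""] = match;
  return { signedRegion, expectedRegion };
}

const sample =
  "Error: The authorization header is malformed; " +
  "the region 'us-east-1' is wrong; expecting 'eu-west-1'.";
console.log(parseRegionMismatch(sample));
```

If the expected region differs from the region of the stack that declares the location, the location is in the wrong stack.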

To set up a pull configuration using the CDK, the steps are:

  1. Define a role in the destination region to run the task. Set "crossRegionReferences: true" for the stack containing the role.
  2. Define the destination bucket, and give the role access to it.
  3. Give the role access to the source bucket.
  4. Create a stack in the source region: call it 'SourceStack'. Also set "crossRegionReferences: true" for it.
  5. Define an S3 location for the source in SourceStack. You will need the ARN of the role from step 1 and the bucket ARN from step 3.
  6. Now, in the destination region, define an S3 location for the destination bucket and a task to copy the source to the destination. You will need, of course, the source location ARN.
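The steps above might look roughly like the following in the TypeScript CDK. This is a sketch under stated assumptions, not a tested deployment: the account ID, stack names, and source bucket ARN are placeholders, the regions are inferred from the error message in the question, and the S3 actions granted on the source bucket are a guess at a read-only minimum. The role and destination bucket sit in their own stack, separate from the task, so that the source stack can depend on the role while the task stack depends on the source location without creating a cycle between stacks. The partner's bucket policy must already allow the role before the source stack is deployed, because DataSync tests access when the location is created.

```typescript
import { App, Stack } from 'aws-cdk-lib';
import * as datasync from 'aws-cdk-lib/aws-datasync';
import * as iam from 'aws-cdk-lib/aws-iam';
import * as s3 from 'aws-cdk-lib/aws-s3';

// Placeholders (assumptions): replace with your own values.
const ACCOUNT = '111111111111';
const DEST_REGION = 'us-east-1';
const SOURCE_REGION = 'eu-west-1';
const SOURCE_BUCKET_ARN = 'arn:aws:s3:::partner-source-bucket';

const app = new App();

// Steps 1-3: role and destination bucket, in the destination region.
const roleStack = new Stack(app, 'RoleStack', {
  env: { account: ACCOUNT, region: DEST_REGION },
  crossRegionReferences: true,
});
const role = new iam.Role(roleStack, 'DataSyncRole', {
  assumedBy: new iam.ServicePrincipal('datasync.amazonaws.com'),
});
const destBucket = new s3.Bucket(roleStack, 'DestBucket');
destBucket.grantReadWrite(role);
// Identity-side access to the partner's bucket (action list is an assumption).
role.addToPolicy(new iam.PolicyStatement({
  actions: ['s3:GetBucketLocation', 's3:ListBucket'],
  resources: [SOURCE_BUCKET_ARN],
}));
role.addToPolicy(new iam.PolicyStatement({
  actions: ['s3:GetObject', 's3:GetObjectTagging'],
  resources: [`${SOURCE_BUCKET_ARN}/*`],
}));

// Steps 4-5: source location, in the same region as the source bucket.
const sourceStack = new Stack(app, 'SourceStack', {
  env: { account: ACCOUNT, region: SOURCE_REGION },
  crossRegionReferences: true,
});
const sourceLocation = new datasync.CfnLocationS3(sourceStack, 'SourceLocation', {
  s3BucketArn: SOURCE_BUCKET_ARN,
  s3Config: { bucketAccessRoleArn: role.roleArn },
});

// Step 6: destination location and task, back in the destination region.
const taskStack = new Stack(app, 'TaskStack', {
  env: { account: ACCOUNT, region: DEST_REGION },
  crossRegionReferences: true,
});
const destLocation = new datasync.CfnLocationS3(taskStack, 'DestLocation', {
  s3BucketArn: destBucket.bucketArn,
  s3Config: { bucketAccessRoleArn: role.roleArn },
});
new datasync.CfnTask(taskStack, 'PullTask', {
  sourceLocationArn: sourceLocation.attrLocationArn,
  destinationLocationArn: destLocation.attrLocationArn,
});

app.synth();
```

Deploying in the order RoleStack, SourceStack, TaskStack follows the dependency chain; CDK works out that order itself from the references between the stacks.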
answered 8 months ago
AWS
EXPERT
reviewed 8 months ago
-2

You can use AWS DataSync to pull data from a partner's S3 bucket across both regions and accounts. Here's how to set it up:

When transferring S3 data across accounts and regions using DataSync, you'll need to:

  1. Create DataSync locations that describe where you're transferring data from (source) and to (destination).
  2. Set up appropriate IAM roles with the necessary permissions.

For your specific cross-account, cross-region scenario:

  • The DataSync source and destination locations should be created in your account (the destination account) in the same region as your destination S3 bucket.
  • You'll need an IAM role for the DataSync source location that has access to your partner's S3 bucket. Your partner needs to grant this role access to their bucket.
  • You'll need another IAM role for the DataSync destination location that has access to your S3 bucket.
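For the partner's side of the grant, a bucket policy along these lines is one way to give the role read access. This is only a sketch: the helper name, the Sid values, and the ARNs in the usage line are invented for illustration, and the action list is an assumed read-only set that should be checked against the current DataSync documentation.

```typescript
// Minimal shape of an S3 bucket policy, enough for this sketch.
interface BucketPolicyStatement {
  Sid: string;
  Effect: 'Allow';
  Principal: { AWS: string };
  Action: string[];
  Resource: string;
}

interface BucketPolicy {
  Version: '2012-10-17';
  Statement: BucketPolicyStatement[];
}

// Hypothetical helper: builds a policy the partner could attach to the
// source bucket so that the given role can list and read it.
function sourceBucketPolicy(roleArn: string, bucketArn: string): BucketPolicy {
  return {
    Version: '2012-10-17',
    Statement: [
      {
        Sid: 'DataSyncListBucket',
        Effect: 'Allow',
        Principal: { AWS: roleArn },
        // Bucket-level actions apply to the bucket ARN itself.
        Action: ['s3:GetBucketLocation', 's3:ListBucket'],
        Resource: bucketArn,
      },
      {
        Sid: 'DataSyncReadObjects',
        Effect: 'Allow',
        Principal: { AWS: roleArn },
        // Object-level actions apply to keys under the bucket.
        Action: ['s3:GetObject', 's3:GetObjectTagging'],
        Resource: `${bucketArn}/*`,
      },
    ],
  };
}

// Placeholder ARNs, for illustration only.
const examplePolicy = sourceBucketPolicy(
  'arn:aws:iam::111111111111:role/DataSyncRole',
  'arn:aws:s3:::partner-source-bucket',
);
console.log(JSON.stringify(examplePolicy, null, 2));
```

Cross-account access needs both halves: this resource policy on the partner's bucket and a matching identity policy on the role in your account.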

If the partner's bucket uses SSE-KMS encryption, the partner's KMS key policy must also allow your role to use the key (at minimum kms:Decrypt, to read the objects).

Regarding the configuration location, there isn't a hard requirement that the DataSync configuration must live in either the source or destination account/region exclusively. The key factor is ensuring the proper permissions are in place. You can create the DataSync task in your account (the destination account) and pull the data from your partner's bucket, as long as your IAM role has been granted the necessary permissions by your partner.

This approach allows you to "pull" the data from your partner's bucket to yours, maintaining control over the transfer process while only requiring your partner to grant appropriate access permissions.
Sources
Transfer customer managed SSE-KMS encrypted objects across AWS accounts and Regions using AWS DataSync | AWS Storage Blog
Data Migration from One account s3 to another account s3 using data sync | AWS re:Post
Configuring AWS DataSync transfers with Amazon S3 - AWS DataSync

answered 8 months ago
  • When I tried to follow the advice here, 'cdk deploy' failed with:

    11:19:50 AM | CREATE_FAILED | AWS::DataSync::LocationS3 | ThirdbridgeMirror/...ce-bucket-location Resource handler returned message: "Invalid request provided: DataSync location access test failed: could not perform s3:ListObjectsV2 on bucket proda-forum-data-delivery-port-analytics. Error: The authorization header is malformed; the region 'us-east-1' is wrong; expecting 'eu-west-1'. (Service: DataSync, Status Code: 400, Request ID: 1a22926b-49f5-4c65-91cd-56c1dd9e0ee5) (SDK Attempt Count: 1)" (RequestToken: d79639be-23b1-20c5-f41d-11dc4ef3f035, HandlerErrorCode: InvalidRequest)
