Migrate on-prem object storage to S3 - Snowball + DataSync


A customer wants to move its IPV installation (with data) from on-prem to AWS.

The on-premises storage layer relies on their own object storage solution, which exposes an S3-compatible API (https://docs.ceph.com/en/latest/radosgw/s3/). The customer now wants to move 70 TB of content to AWS together with the whole IPV suite.

While the compute-related part has been solved (they have been in contact with IPV and sized it according to their needs), the data migration is still open. We have discussed both DataSync and Snowball.

DataSync supports an on-prem object store as a source and keeps the metadata intact. It could handle the full migration and then scheduled syncs until the cut-over happens, but moving 70 TB takes 8-9 days over a dedicated 1 Gbit/s connection. I assume the customer would also need to purchase Direct Connect to guarantee that speed.
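For reference, the 8-9 day figure can be sanity-checked with some quick arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 Gbit/s = 10^9 bit/s) and a hypothetical `utilization` factor for protocol overhead and link contention; the function name and the utilization values are illustrative assumptions, not DataSync figures.

```python
def transfer_days(size_tb: float, link_gbps: float, utilization: float = 1.0) -> float:
    """Estimate days needed to move size_tb terabytes over a link_gbps link.

    Assumes decimal units and a constant effective throughput of
    link_gbps * utilization (utilization is a rough, assumed factor).
    """
    if size_tb < 0 or link_gbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("size must be >= 0, link > 0, utilization in (0, 1]")
    total_bits = size_tb * 1e12 * 8
    seconds = total_bits / (link_gbps * 1e9 * utilization)
    return seconds / 86400  # seconds per day


if __name__ == "__main__":
    for u in (1.0, 0.8, 0.75):
        print(f"{u:.0%} utilization: {transfer_days(70, 1, u):.1f} days")
```

At full line rate 70 TB needs roughly 6.5 days, so 8-9 days corresponds to sustaining about 75-80% of the 1 Gbit/s link for the whole transfer.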

My preferred option for this customer, however, would be to use Snowball for the initial bulk migration and then use DataSync to keep the data in sync afterwards. Does Snowball allow copying the objects from the on-prem object store together with their metadata and moving them to S3?

1 Answer
Accepted Answer

With regard to moving data from an on-prem Ceph cluster to AWS, here are some considerations:

(1) Which version / release of Ceph is this?

(2) Are you trying to move data from RBD, RGW, and CephFS, or just RGW?

(3) You can certainly use DataSync to transfer data from RGW, but you will need a reasonably fast connection to move 70 TB if there is a timeline that has to be met.

(4) You can use Snowball Edge for the bulk migration by transferring data from Ceph onto the device, but the metadata will not be retained if you use the S3 interface for the transfer. (You can retain POSIX metadata if you use the File Interface/NFS, though that will be considerably slower than the S3 endpoint.) With this method, once the data has been imported into S3 you can run DataSync to synchronize the metadata and any updates. Note: S3 on Snowball Edge supports only a subset of the S3 API/CLI, so this would need to be tested.
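To make the follow-up pass in (4) concrete, here is a minimal sketch of what "synchronize the metadata and any updates" has to cover after the Snowball import. It is not how DataSync works internally; `ObjectInfo`, `objects_needing_sync`, and the sample listings are hypothetical and only illustrate which objects remain out of sync: those missing in S3, those changed since the copy, and those whose user metadata was lost.

```python
from typing import Dict, List, NamedTuple


class ObjectInfo(NamedTuple):
    """Hypothetical summary of one object in a bucket listing."""
    size: int
    metadata: Dict[str, str]


def objects_needing_sync(
    source: Dict[str, ObjectInfo], destination: Dict[str, ObjectInfo]
) -> List[str]:
    """Return keys that are missing, resized, or have different metadata."""
    stale = []
    for key, src in source.items():
        dst = destination.get(key)
        if dst is None or dst.size != src.size or dst.metadata != src.metadata:
            stale.append(key)
    return sorted(stale)


if __name__ == "__main__":
    # Made-up listings: Ceph RGW as source, S3 after the Snowball import.
    ceph = {
        "a.mxf": ObjectInfo(100, {"project": "x"}),
        "b.mxf": ObjectInfo(200, {}),  # written after the Snowball copy
        "c.mxf": ObjectInfo(300, {}),
    }
    s3 = {
        "a.mxf": ObjectInfo(100, {}),  # metadata lost during the S3 transfer
        "c.mxf": ObjectInfo(300, {}),
    }
    print(objects_needing_sync(ceph, s3))
```

In this made-up example only `a.mxf` (metadata lost) and `b.mxf` (new since the copy) still need attention, which is why the incremental pass should be much smaller than the initial 70 TB transfer.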

answered 2 years ago
