What is the best way to transfer data from on-prem to S3 over the internet?

0

What is the best way to transfer data from on-prem to S3 over the internet? I have two options: SFTP or aws s3 cp/sync. I cannot use DataSync, since deploying the DataSync agent would require a public IP. The initial data set is around 400 GB, and once a week we will transfer around 100-150 GB.

GB
Asked 2 months ago · 387 views
4 Answers
1

Hello.

If the transfer goes over the internet, I think it is better to use "aws s3 cp" or "aws s3 sync".
Because they use multipart upload, large files are split into parts that upload in parallel, which improves throughput.
https://repost.aws/knowledge-center/s3-multipart-upload-cli
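As a rough illustration of the approach above, the following sketch raises the multipart settings and then syncs a local directory. The function names, the 64MB and 20 values, and the paths are assumptions chosen for illustration, not recommendations from the answer; tune them to your link.

```shell
# Sketch only: function names and tuning values are hypothetical.

tune_multipart() {
  # Larger parts than the 8MB default mean fewer requests per large file.
  aws configure set default.s3.multipart_threshold 64MB
  aws configure set default.s3.multipart_chunksize 64MB
  # Allow more parallel requests than the default of 10.
  aws configure set default.s3.max_concurrent_requests 20
}

sync_to_s3() {
  local src_dir="$1" dest_uri="$2"
  # sync copies only new or changed files, which suits a recurring weekly run.
  aws s3 sync "$src_dir" "$dest_uri" --only-show-errors
}
```

A run might look like `tune_multipart` once, then `sync_to_s3 /data/export s3://example-bucket/export` for the initial load and each weekly transfer (both the path and the bucket name are placeholders).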

Alternatively, if you have a dedicated connection such as AWS Direct Connect instead of going over the internet, you can also use DataSync, as described in the following guide.
https://docs.aws.amazon.com/datasync/latest/userguide/s3-cross-account-transfer.html

Expert
Answered 2 months ago
Expert
Steve_M
Reviewed 2 months ago
1

Hello,

As already mentioned, "aws s3 cp" and "aws s3 sync" are the best ways to transfer data to S3 over the internet. To speed up transfers, you can enable the CRT (AWS Common Runtime) transfer client with the command below.

aws configure set default.s3.preferred_transfer_client crt

https://aws.amazon.com/ko/blogs/storage/improving-amazon-s3-throughput-for-the-aws-cli-and-boto3-with-the-aws-common-runtime/

S3 Transfer Acceleration (TA) can also be enabled on the bucket if the source location is far from the bucket's Region.
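For the Transfer Acceleration option mentioned above, a minimal sketch of turning it on could look like this. The function name and the bucket argument are hypothetical; the two CLI calls are the standard ones for enabling acceleration on a bucket and pointing the CLI at the accelerate endpoint.

```shell
# Sketch only: the function name is hypothetical.

enable_transfer_acceleration() {
  local bucket="$1"
  # Turn on Transfer Acceleration for the bucket.
  aws s3api put-bucket-accelerate-configuration \
    --bucket "$bucket" \
    --accelerate-configuration Status=Enabled
  # Make aws s3 cp/sync use the accelerate endpoint by default.
  aws configure set default.s3.use_accelerate_endpoint true
}
```

Calling `enable_transfer_acceleration example-bucket` (a placeholder name) would apply both steps. Transfer Acceleration is billed separately and does not work with bucket names that contain dots, so it is worth measuring whether it actually helps from your location before leaving it on.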

AWS
Answered 2 months ago
0

As Riku mentioned, `aws s3 sync` or `aws s3 cp` works well over the internet.

Depending on your use case, you may want to explore deploying Amazon S3 File Gateway on premises.

It provides a file server interface that supports the NFS and SMB protocols. S3 File Gateway provides low-latency access to data through transparent local caching. An S3 File Gateway manages data transfer to and from AWS, buffers applications from network congestion, optimizes and streams data in parallel, and manages bandwidth consumption.

You deploy the gateway into your on-premises environment as a virtual machine (VM) running on VMware ESXi, Microsoft Hyper-V, or Linux Kernel-based Virtual Machine (KVM), or as a hardware appliance.

Refer to "How Amazon S3 File Gateway works" for an overview.

AWS
Expert
Mike_L
Answered 2 months ago
Expert
Steve_M
Reviewed 2 months ago
0

Hello,

You don't need a public IP assigned to the agent to use AWS DataSync. The DataSync agent VM deployed on premises needs access to the DataSync public endpoints. The agent VM can sit in a private network with a private IP and reach the DataSync endpoints through a NAT or similar. You can also activate the agent against VPC endpoints if you have network connectivity from on premises to AWS through either Direct Connect or a Site-to-Site VPN.

https://docs.aws.amazon.com/datasync/latest/userguide/choose-service-endpoint.html
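For the VPC endpoint option described above, registering the agent could be sketched roughly as follows. The function name, the agent name `onprem-agent`, and all four arguments are hypothetical placeholders; the activation key has to be obtained from the deployed agent first, which this sketch does not cover.

```shell
# Sketch only: function name, agent name, and argument values are hypothetical.

create_private_agent() {
  local activation_key="$1" vpc_endpoint_id="$2" subnet_arn="$3" sg_arn="$4"
  # Register the agent so it talks to DataSync through the VPC endpoint
  # rather than the public service endpoint.
  aws datasync create-agent \
    --agent-name onprem-agent \
    --activation-key "$activation_key" \
    --vpc-endpoint-id "$vpc_endpoint_id" \
    --subnet-arns "$subnet_arn" \
    --security-group-arns "$sg_arn"
}
```

This path only applies when the on-premises network can already reach the VPC, for example over Direct Connect or a Site-to-Site VPN as noted above.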

You may want to consider DataSync after evaluating your use case further. Based on the notes in the question, though, I would recommend using the AWS CLI s3 cp/sync commands as mentioned above.

psp
Answered 2 months ago
