Questions tagged with AWS DataSync
Migrate on-prem object storage to S3 - Snowball + DataSync
A customer wants to move its IPV installation (with data) from on-prem to AWS. The on-premises storage layer relies on their own object storage solution, which has an S3-compatible API (https://docs.ceph.com/en/latest/radosgw/s3/). The customer now wants to move 70 TB of content to AWS together with the whole IPV suite. The compute part has been solved (they have been in contact with IPV and sized it according to their needs), but the data migration is still open. We have discussed both DataSync and Snowball. DataSync supports an on-premises object store as a source (https://aws.amazon.com/blogs/aws/aws-datasync-adds-support-for-on-premises-object-storage/) and keeps the metadata intact. It could handle the full migration and then scheduled syncs until the cut-over happens, but moving 70 TB takes 8-9 days with a dedicated 1 Gbit/s connection. I assume the customer would also need to purchase a Direct Connect to guarantee this speed. My preferred option for this customer, however, would be to use Snowball for the initial bulk migration and then use DataSync to keep the data in sync afterwards. Does Snowball allow copying the objects from the on-prem object store together with their metadata and moving them to S3?
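As a rough check on the 8-9 day figure, the arithmetic can be sketched as follows. The function name and the `utilization` parameter are illustrative assumptions, not anything from DataSync itself; the sketch uses decimal units (1 TB = 10^12 bytes) and ignores protocol overhead beyond the single utilization factor.

```python
def transfer_days(size_tb: float, link_gbps: float, utilization: float = 1.0) -> float:
    """Estimate the days needed to move size_tb terabytes over a network link.

    size_tb      -- data volume in decimal terabytes (10**12 bytes)
    link_gbps    -- nominal link speed in gigabits per second
    utilization  -- assumed fraction of the link actually achieved (0 < u <= 1)
    """
    bits = size_tb * 1e12 * 8
    seconds = bits / (link_gbps * 1e9 * utilization)
    return seconds / 86400  # seconds per day


if __name__ == "__main__":
    for u in (1.0, 0.8, 0.7):
        print(f"{u:.0%} utilization: {transfer_days(70, 1, u):.1f} days")
```

At full line rate this gives about 6.5 days for 70 TB; the 8-9 day estimate corresponds to sustaining roughly 70-80% of a 1 Gbit/s link.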
What's the fastest way to move a 2TB Oracle Database to EC2?
A customer has a 2 TB Oracle database that they need to lift and shift to AWS. The database contains only **one** schema, and 2 TB is the size of the backup. What would be the *fastest* way with *minimum downtime*? Clarification: the customer has a Site-to-Site VPN.
Move files from S3 to FSx for Windows
A customer has files on S3 (provided by Transfer Family) and needs to copy or move the files uploaded to S3 to their FSx for Windows file system. A dedicated machine could do this job, but I wonder whether it is possible to solve this without having to manage a dedicated EC2 instance. My best guess so far is to use DataSync. I know that I can't use the combination "Location S3, destination FSx". However, the combination "Location S3, destination SMB" is possible. 1) Is it possible to use DataSync for the combination "Location S3, destination SMB", where the SMB share is actually FSx? 2) Are there any pitfalls in solving the customer's problem this way? 3) Do you see a better solution for moving files from S3 to FSx?
How does DataSync determine if a file has changed?
The example use case is a daily local data export that needs to sync to S3. There is no way to check what has changed before exporting, so the full dataset must be exported every time. The aim is to upload only what has changed to S3. DataSync seems like it will work (it will only be a one-way transfer daily, so I am choosing DataSync over Transfer or File Gateway), but how does it determine what has changed? I found the docs (https://docs.aws.amazon.com/datasync/latest/userguide/how-datasync-works.html#transfering-files), which say:
> In the PREPARING status, DataSync examines the source and destination file systems to determine which files to sync. It does so by recursively scanning the contents of the source and destination file systems for differences.
Will this scan the contents of the files themselves, or just the filesystem metadata? Does anyone know any more details on how it determines what has changed?
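To make the distinction in the question concrete, here is a minimal sketch of what a metadata-only comparison looks like. This is not DataSync's implementation and makes no claim about which attributes DataSync checks; the function name and the choice of size plus modification time are assumptions for illustration only.

```python
import os
from pathlib import Path


def changed_by_metadata(src: Path, dst: Path) -> list[str]:
    """Return relative paths under src that are missing in dst,
    or whose size or modification time (whole seconds) differs.

    File contents are never read; only os.stat() metadata is compared.
    """
    changed = []
    for root, _dirs, files in os.walk(src):
        for name in files:
            src_file = Path(root) / name
            rel = src_file.relative_to(src)
            dst_file = dst / rel
            if not dst_file.exists():
                changed.append(rel.as_posix())
                continue
            s, d = src_file.stat(), dst_file.stat()
            if s.st_size != d.st_size or int(s.st_mtime) != int(d.st_mtime):
                changed.append(rel.as_posix())
    return sorted(changed)
```

Under a scheme like this, a full re-export that rewrites every file would update every modification time, so every file would be flagged as changed even if its bytes were identical. A content-based comparison (for example, hashing each file) would avoid that at the cost of reading all the data on both sides.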
DataSync Agent -- What user does it run as?
A customer is running a POC of DataSync to sync files from an Isilon file system to S3. They have strict audit logging requirements and would like to configure the DataSync agent to access the file system via NFS using a specific user in order to track its access in their audit logs. What user does the DataSync Agent run as? Can it be configured to run as a different user or to access the NFS mount as a specific user?
Copy data from EFS to EFS in another AWS account.
Hi Everyone, I'm looking for some guidance on copying EFS data from one AWS account to another AWS account. From what I understand from reading other posts, I need to choose NFS as the source and EFS as the destination. How do I go about referencing the EFS share in the destination location? It won't show in the drop-down list because it resides in another AWS account, which makes sense. Do I need to create a peering connection between the two VPCs? Or is there something else I'm missing? My agent is activated via a public endpoint, if that matters. Thanks for any help provided, Craig.
Question on overwrite and multiagent syncing
Hi. I have a question about the "Overwrite files" flag, which I wasn't sure about and can't find an exact answer to. Does enabling this flag overwrite each and every file found in the source at the destination, or does it overwrite ONLY changed files based on the checksum? If it's the former, which options should I choose so that AWS only uploads new and changed files (to accommodate a case where, e.g., only half of a file had been pushed into S3 before the agent went down)? One other question: is it possible to have two agents syncing from two different locations into one destination bucket? Thanks.
Metadata not retained using DataSync.
A customer is migrating data from on-prem to Amazon S3. The customer would like to retain metadata, i.e. the file created date and modified date, after migrating to S3. As the data size is almost 25 TB and they have very good bandwidth, we are planning to use DataSync. Based on the DataSync documentation quoted below, I thought the customer could retain the metadata, which they use for audit purposes: "When copying between an NFS server and Amazon S3 – In this case, the following metadata is stored as Amazon S3 user metadata: File and folder modification timestamps". After migration, I see the date and time of the migration but not the original created date. How can we retain the actual file dates after migrating to the S3 bucket?
VPC Private Endpoint Service for DataSync
A customer is going to use DataSync to migrate data from on-prem to an S3 bucket over a private network. A DX connection has been established between on-prem and the AWS VPC. In DataSync, we can create a VPC private endpoint according to our documentation: https://docs.aws.amazon.com/datasync/latest/userguide/datasync-in-vpc.html The DataSync agent will be deployed on EC2 in the VPC. I wonder whether we also need to create another VPC endpoint for S3 to ensure that end-to-end traffic remains private.
How do DataSync incremental transfers work?
A customer is planning to use DataSync to copy data to EFS, and a question came up about incremental transfers: if a single byte is changed in a source file after an initial transfer, does DataSync transfer the entire file or just the changed bytes during a subsequent transfer? The documentation (https://docs.aws.amazon.com/datasync/latest/userguide/how-datasync-works.html) states that only files that have been added, modified, or deleted are transferred. This suggests to me that an incremental transfer copies an entire file if it has changed, not just a delta of the changed bytes. Is that the case, or does it work like rsync and transfer only the changed bytes?
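For readers unfamiliar with the "delta of changed bytes" idea, the sketch below shows a simplified block-level comparison. It only illustrates the concept the question contrasts with whole-file copying: it is not how DataSync works, and it is far simpler than rsync's actual algorithm, which uses rolling checksums to handle insertions and shifted data. The function name and the 4096-byte block size are assumptions.

```python
import hashlib


def changed_blocks(old: bytes, new: bytes, block_size: int = 4096) -> list[int]:
    """Return the indices of fixed-size blocks whose contents differ.

    A delta-style transfer would send only these blocks; a whole-file
    transfer would resend everything if this list is non-empty.
    """
    n_blocks = (max(len(old), len(new)) + block_size - 1) // block_size
    changed = []
    for i in range(n_blocks):
        a = old[i * block_size:(i + 1) * block_size]
        b = new[i * block_size:(i + 1) * block_size]
        if hashlib.sha256(a).digest() != hashlib.sha256(b).digest():
            changed.append(i)
    return changed
```

With a 10,000-byte file and one byte flipped at offset 5000, only block 1 is reported as changed, so a delta scheme would resend 4096 bytes instead of the full file.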
DataSync EFS to S3 not working
Hi AWS team, I am trying to follow https://aws.amazon.com/premiumsupport/knowledge-center/datasync-transfer-efs-s3/ to transfer data from EFS to S3, but I am currently unable to select S3 as the destination location. The only options available are EFS and SMB, and I am not sure why. I have tried creating the locations separately, but even then the S3 location is not offered as a destination. The DataSync agent was created on EC2 and activated. I have tried with and without endpoints but still cannot find S3 when selecting the destination location. Regards, Vikas Edited by: vikassachdeva on Apr 6, 2020 5:52 AM
AWS DataSync - Control Traffic Details
A customer is working to make AWS DataSync available for internal use. Their Security Architect is asking what the control traffic on ports 1024-1064 includes. Is this traffic encrypted? I understand that this is a communication requirement between the DataSync agent (EC2/VM) and the DataSync service, per https://docs.aws.amazon.com/datasync/latest/userguide/datasync-network.html Thanks