When AWS DataSync transfers files to Amazon S3, it sometimes applies default system-defined metadata to objects, which can result in unexpected ContentType values. In your case, DataSync is applying "application/octet-stream" as the ContentType for your CSV files instead of the expected "text/csv" or "application/csv".
This behavior is a known characteristic of DataSync. Files on a Linux NFS share carry no MIME type of their own, so there is nothing on the source for DataSync to map to an S3 ContentType, and in your case it has not inferred one from the .csv extension either.
You have a few options to address this:
- Post-transfer metadata update: After the DataSync task completes, you can use the AWS CLI or SDK to update the ContentType of the transferred objects. S3 object metadata cannot be edited in place, so the update is done by copying each object over itself with the new ContentType.
- Manual transfer with metadata: For critical files where the ContentType is important, you can upload with the AWS CLI and set the type explicitly. For example:
  aws s3 cp yourfile.csv s3://your-bucket/path/yourfile.csv --content-type "text/csv"
- Scripted upload with explicit metadata: When you upload files to S3 yourself through the CLI or an SDK, you can set ContentType (a system-defined field) along with any user-defined metadata on each request. Like the previous option, this means scripting the transfer rather than using DataSync.
DataSync's primary focus is on transferring data efficiently while preserving core file attributes. ContentType is an S3-side property rather than an attribute of the source file, which is why many file types end up with the "application/octet-stream" default after a transfer.
If the correct ContentType is critical for your use case, you may need to implement a post-processing step to update this metadata after the DataSync transfer completes.
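As a rough illustration of such a post-processing step, here is a minimal Python sketch using boto3. It assumes the standard S3 client calls list_objects_v2, head_object, and copy_object; the function name fix_content_types, the CONTENT_TYPES map, and the bucket and prefix values are made up for this example. It copies each matching object onto itself with a replaced ContentType and passes the existing user-defined metadata back in, since a metadata-replacing copy would otherwise drop it.

```python
import posixpath

# Hypothetical extension-to-ContentType map; extend it for other file types.
CONTENT_TYPES = {".csv": "text/csv"}


def fix_content_types(s3, bucket, prefix=""):
    """Set ContentType on objects under `prefix` whose extension is mapped.

    `s3` is assumed to behave like boto3.client("s3").
    Returns the list of keys that were rewritten.
    """
    updated = []
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            ext = posixpath.splitext(key)[1].lower()
            wanted = CONTENT_TYPES.get(ext)
            if wanted is None:
                continue  # extension not in the map, leave the object alone

            head = s3.head_object(Bucket=bucket, Key=key)
            if head.get("ContentType") == wanted:
                continue  # already correct, skip the copy

            # Copy the object onto itself. REPLACE discards the old metadata,
            # so the existing user-defined metadata is passed back explicitly.
            s3.copy_object(
                Bucket=bucket,
                Key=key,
                CopySource={"Bucket": bucket, "Key": key},
                MetadataDirective="REPLACE",
                ContentType=wanted,
                Metadata=head.get("Metadata", {}),
            )
            updated.append(key)
    return updated


# Usage (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   fix_content_types(boto3.client("s3"), "your-bucket", "path/")
```

Treat this as a starting point rather than a finished tool: a single copy_object call only works for objects up to 5 GB, and other settings such as storage class, Content-Encoding, or Cache-Control are not carried over by this sketch and may need to be passed explicitly.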
