Dotnet SDK TransferUtility doesn't calculate PartSize properly

I am trying to verify that a file downloaded from S3 matches its ETag.

When uploading the file, I use TransferUtility with an explicit PartSize (16 * 1024 * 1024). After downloading, I calculate the MD5 of the first 16 MB and store it in a byte list, then calculate the MD5 of the remaining part and add it to the same list. The MD5 of that list, followed by the part count, should then match the ETag.
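To make the calculation above concrete, here is a minimal sketch of it. The class and method names (MultipartEtag, Compute) are hypothetical, and it assumes the usual multipart ETag layout: the MD5 of the concatenated per-part MD5 digests, followed by "-" and the number of parts. It does not cover objects uploaded in a single part or objects where the ETag is not MD5-based (for example with some server-side encryption settings).

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class MultipartEtag
{
    // Computes an S3-style multipart ETag for the given content,
    // assuming every part except the last is exactly partSize bytes.
    public static string Compute(Stream content, long partSize)
    {
        if (partSize <= 0) throw new ArgumentOutOfRangeException(nameof(partSize));

        using var partHash = IncrementalHash.CreateHash(HashAlgorithmName.MD5);
        using var digestOfDigests = IncrementalHash.CreateHash(HashAlgorithmName.MD5);
        var buffer = new byte[81920];
        long bytesInPart = 0;
        int partCount = 0;

        while (true)
        {
            // Never read past the end of the current part.
            int wanted = (int)Math.Min(buffer.Length, partSize - bytesInPart);
            int read = content.Read(buffer, 0, wanted);
            if (read == 0) break;

            partHash.AppendData(buffer, 0, read);
            bytesInPart += read;

            if (bytesInPart == partSize)
            {
                // Part complete: feed its MD5 into the outer hash.
                digestOfDigests.AppendData(partHash.GetHashAndReset());
                partCount++;
                bytesInPart = 0;
            }
        }

        if (bytesInPart > 0)
        {
            // Final, shorter part.
            digestOfDigests.AppendData(partHash.GetHashAndReset());
            partCount++;
        }

        string hex = Convert.ToHexString(digestOfDigests.GetHashAndReset()).ToLowerInvariant();
        return $"{hex}-{partCount}";
    }
}
```

Calling MultipartEtag.Compute(downloadedStream, 16 * 1024 * 1024) gives the value I compare against the ETag returned by S3 (after stripping the surrounding quotes).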

When I use a MemoryStream as the source, everything works perfectly and the ETag matches. However, when I use a CryptoStream to encrypt the content on the fly, it does not. The debug log shows that TransferUtility does not honor PartSize and adds an extra 8 KB to each part.

Here is an excerpt from log:

TransferUtility 444|2024-02-26T19:37:30.234Z|DEBUG|Beginning upload of stream
TransferUtility 445|2024-02-26T19:37:30.234Z|DEBUG|Upload part size 16777216.
TransferUtility 448|2024-02-26T19:37:31.888Z|DEBUG|Uploaded part 1. (Last part = False, Part size = 16785408, Upload Id: 1gLA0rx8CX0qveZGZoqHE1t_WE7qj9MiwkVC13qKC0IkFOQUWv_Qzs9uDRn3rLwZzT6QxPr7871_HA0mSn66h81xxD1ttH5pLXLUUIVHQB6yaJaNRUoNfZ68_r4wEOBL)

As you can see, 16785408 is not equal to 16777216; it is exactly 8 KB (8192 bytes) more. As a result, when I split the downloaded file into 16 MB chunks to calculate the MD5, the result does not match the ETag.
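To illustrate why the extra 8 KB breaks the check, here is a small sketch that lists the part lengths for a given total size. The helper name PartLengths and the 40,000,000-byte object size are hypothetical, and it assumes every non-final part has the same size as part 1 in the log, which the excerpt alone does not prove.

```csharp
using System;
using System.Collections.Generic;

public static class PartLayout
{
    // Splits totalLength into consecutive parts of partSize bytes;
    // the last part holds whatever remains.
    public static List<long> PartLengths(long totalLength, long partSize)
    {
        if (partSize <= 0) throw new ArgumentOutOfRangeException(nameof(partSize));

        var lengths = new List<long>();
        for (long offset = 0; offset < totalLength; offset += partSize)
        {
            lengths.Add(Math.Min(partSize, totalLength - offset));
        }
        return lengths;
    }
}
```

For a 40,000,000-byte object, PartLengths(40_000_000, 16777216) and PartLengths(40_000_000, 16785408) both return three parts, but every part covers a different byte range, so every per-part MD5 differs and the final ETag cannot match.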

Some code:

// Create a stream that encrypts the Stream named input
Aes aesAlg = Aes.Create();
aesAlg.GenerateKey();
aesAlg.GenerateIV();

ICryptoTransform encryptor = aesAlg.CreateEncryptor(aesAlg.Key, aesAlg.IV);
CryptoStream cryptoStream = new CryptoStream(input, encryptor, CryptoStreamMode.Read);
// Reading from this stream yields content encrypted on the fly, without holding the whole file and its encrypted data in memory

using var transferUtility = new TransferUtility(_s3Client);
var transferRequest = new TransferUtilityUploadRequest()
{
    BucketName = _bucket,
    Key = key,
    InputStream = cryptoStream,
    ContentType = "application/x-binary",
    CannedACL = S3CannedACL.Private,
    PartSize = 16 * 1024 * 1024,
};
await transferUtility.UploadAsync(transferRequest, cancellationToken);

For now, my workaround is to use a part size of 16 MB + 8 KB whenever I calculate the ETag, but I hope you can fix it.

asked 2 months ago, 94 views
1 Answer

The issue you're encountering seems to be related to how TransferUtility splits a stream into parts when the source is a CryptoStream. Judging by your log, it may be uploading 8 KB (8192 bytes) more than the configured PartSize in each part, so the part boundaries, and therefore the ETag, differ from what you calculate with 16 MB chunks.
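One difference between the two sources that can be checked directly is seekability: a MemoryStream can seek and report its length, while a CryptoStream cannot. Whether this is what triggers the extra 8 KB inside TransferUtility is an assumption here, not something confirmed from the SDK source. A minimal sketch (the class and method names are illustrative):

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class SeekabilityDemo
{
    // Reports whether a MemoryStream and a CryptoStream wrapped around it can seek.
    public static (bool MemoryCanSeek, bool CryptoCanSeek) Probe()
    {
        using var aes = Aes.Create(); // key and IV are generated automatically
        using var plain = new MemoryStream(new byte[1024]);
        using var encryptor = aes.CreateEncryptor();
        using var crypto = new CryptoStream(plain, encryptor, CryptoStreamMode.Read);

        return (plain.CanSeek, crypto.CanSeek);
    }
}
```

Because the CryptoStream cannot report its length, any code consuming it has to read until the stream ends rather than plan part boundaries up front.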

To address this issue, you can try the following workaround:

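You already set PartSize on the TransferUtilityUploadRequest, so the remaining step is to make the part size used for verification agree with what was actually uploaded. One option is to set PartSize to 16 MB + 8 KB, as in the code below. This is untested: if the utility overshoots whatever value is configured, the uploaded parts may end up larger still, so confirm the actual part size in the debug log and use that value when recalculating the ETag.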

Here's how you can modify your code to implement this workaround:


using System.IO;
using System.Security.Cryptography;
using Amazon.S3;
using Amazon.S3.Transfer;

// Create an Aes object for encryption
Aes aesAlg = Aes.Create();
aesAlg.GenerateKey();
aesAlg.GenerateIV();

// Create an ICryptoTransform object for encryption
ICryptoTransform encryptor = aesAlg.CreateEncryptor(aesAlg.Key, aesAlg.IV);

// Create a CryptoStream for encryption
CryptoStream cryptoStream = new CryptoStream(input, encryptor, CryptoStreamMode.Read);

// Create an instance of AmazonS3Client
IAmazonS3 s3Client = new AmazonS3Client();

// Create a TransferUtility object with the AmazonS3Client
using var transferUtility = new TransferUtility(s3Client);

// Create a TransferUtilityUploadRequest object
var transferRequest = new TransferUtilityUploadRequest()
{
    BucketName = _bucket,
    Key = key,
    InputStream = cryptoStream,
    ContentType = "application/x-binary",
    CannedACL = S3CannedACL.Private,
    PartSize = (16 * 1024 * 1024) + (8 * 1024), // Set part size to 16MB + 8KB
};

// Upload the encrypted stream using TransferUtility
await transferUtility.UploadAsync(transferRequest, cancellationToken);
answered 2 months ago