Dotnet SDK TransferUtility doesn't calculate PartSize properly

I am trying to verify that a file downloaded from S3 matches its ETag.

When uploading the file, I use TransferUtility with a specified PartSize (16 * 1024 * 1024). After downloading, I calculate the MD5 of the first 16 MB and store it in a byte list, then calculate the MD5 of the remaining part and add it to the same list.
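For reference, the verification described above can be sketched as follows. This is a minimal sketch, not the SDK's own code: it assumes the commonly observed multipart ETag format (the hex MD5 of the concatenated per-part MD5 digests, followed by "-" and the part count), and the class name MultipartETag and method Compute are hypothetical names chosen for illustration. The result only matches when partSize equals the part size actually used during the upload.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;

public static class MultipartETag
{
    // Hypothetical helper: builds "<hex md5 of concatenated part md5s>-<part count>".
    // Assumes every part except the last has exactly partSize bytes.
    public static string Compute(Stream content, int partSize)
    {
        if (partSize <= 0) throw new ArgumentOutOfRangeException(nameof(partSize));

        using var md5 = MD5.Create();
        var partDigests = new List<byte>();
        var buffer = new byte[partSize];
        int partCount = 0;

        while (true)
        {
            // Fill the buffer completely unless the stream ends first;
            // a single Read call may return fewer bytes than requested.
            int filled = 0;
            while (filled < partSize)
            {
                int read = content.Read(buffer, filled, partSize - filled);
                if (read == 0) break;
                filled += read;
            }
            if (filled == 0) break; // no more data

            partDigests.AddRange(md5.ComputeHash(buffer, 0, filled));
            partCount++;
            if (filled < partSize) break; // a short part is the last part
        }

        // MD5 over the concatenated 16-byte part digests, rendered as lowercase hex.
        byte[] final = md5.ComputeHash(partDigests.ToArray());
        string hex = BitConverter.ToString(final).Replace("-", "").ToLowerInvariant();
        return hex + "-" + partCount;
    }
}
```

Calling MultipartETag.Compute(downloadedStream, 16 * 1024 * 1024) and comparing the result with the ETag returned by S3 (with the surrounding quotes removed) is the check that fails in the CryptoStream case described below.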

When I use a MemoryStream as the source, everything works perfectly and the ETag matches. However, when I use a CryptoStream to encrypt the content on the fly, it does not. The debug output shows that TransferUtility handles PartSize incorrectly and adds an extra 8 KB to the part.

Here is an excerpt from log:

TransferUtility 444|2024-02-26T19:37:30.234Z|DEBUG|Beginning upload of stream
TransferUtility 445|2024-02-26T19:37:30.234Z|DEBUG|Upload part size 16777216.
TransferUtility 448|2024-02-26T19:37:31.888Z|DEBUG|Uploaded part 1. (Last part = False, Part size = 16785408, Upload Id: 1gLA0rx8CX0qveZGZoqHE1t_WE7qj9MiwkVC13qKC0IkFOQUWv_Qzs9uDRn3rLwZzT6QxPr7871_HA0mSn66h81xxD1ttH5pLXLUUIVHQB6yaJaNRUoNfZ68_r4wEOBL)

As you can see, 16785408 is not equal to 16777216; it is exactly 8 KB (8192 bytes) more. Consequently, if I split the downloaded file into 16 MB chunks to calculate the MD5 digests, the ETag does not match.
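To make the effect of the extra 8192 bytes concrete, here is a small sketch that lists the part lengths for a given object length and part size. The class name PartLayout is hypothetical, and it assumes every part except the last has the same size, which the log above only confirms for part 1.

```csharp
using System;
using System.Collections.Generic;

public static class PartLayout
{
    // Splits totalLength into consecutive parts of partSize bytes;
    // the last part holds whatever remains.
    public static List<long> Split(long totalLength, long partSize)
    {
        if (partSize <= 0) throw new ArgumentOutOfRangeException(nameof(partSize));

        var parts = new List<long>();
        for (long offset = 0; offset < totalLength; offset += partSize)
        {
            parts.Add(Math.Min(partSize, totalLength - offset));
        }
        return parts;
    }
}
```

For a hypothetical 40,000,000-byte object, Split(40000000, 16777216) gives parts of 16777216, 16777216 and 6445568 bytes, while Split(40000000, 16785408) gives 16785408, 16785408 and 6429184 bytes. Every part boundary after the first differs, so every per-part MD5 differs too.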

Some code:

// Create a stream that encrypts the Stream named input
Aes aesAlg = Aes.Create();
aesAlg.GenerateKey();
aesAlg.GenerateIV();

ICryptoTransform encryptor = aesAlg.CreateEncryptor(aesAlg.Key, aesAlg.IV);
CryptoStream cryptoStream = new CryptoStream(input, encryptor, CryptoStreamMode.Read);
// Reading from this stream now yields content encrypted on the fly, without holding the whole file or the encrypted data in memory

using var transferUtility = new TransferUtility(_s3Client);
var transferRequest = new TransferUtilityUploadRequest()
{
    BucketName = _bucket,
    Key = key,
    InputStream = cryptoStream,
    ContentType = "application/x-binary",
    CannedACL = S3CannedACL.Private,
    PartSize = 16 * 1024 * 1024,
};
await transferUtility.UploadAsync(transferRequest, cancellationToken);

For now, my workaround is to use a part size of 16 MB + 8 KB whenever I calculate the ETag, but I hope you can fix it.

Asked 2 months ago · 101 views
1 Answer
The log confirms the mismatch: you requested a part size of 16,777,216 bytes, but part 1 was uploaded with 16,785,408 bytes, exactly 8,192 bytes more. It appears that when the source is a CryptoStream, TransferUtility does not cut parts exactly at PartSize, so an ETag recomputed over 16 MB chunks no longer matches the one S3 reports.
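One concrete difference between the two stream types in the question is seekability: an uploader cannot know the total length of an unseekable stream in advance. Whether this is what causes the extra 8 KB is an assumption, not something the log confirms. The sketch below only demonstrates the difference itself; SeekabilityDemo is a hypothetical name.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class SeekabilityDemo
{
    // Returns CanSeek for a MemoryStream and for a CryptoStream wrapped around it.
    public static (bool MemoryCanSeek, bool CryptoCanSeek) Check()
    {
        using var source = new MemoryStream(new byte[64]);
        using var aes = Aes.Create();
        using var encryptor = aes.CreateEncryptor();
        using var crypto = new CryptoStream(source, encryptor, CryptoStreamMode.Read);

        // MemoryStream supports seeking; CryptoStream does not.
        return (source.CanSeek, crypto.CanSeek);
    }
}
```

SeekabilityDemo.Check() returns (True, False), which would be consistent with the MemoryStream and CryptoStream uploads being handled differently.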

To address this, you can try the following workaround:

Specify the part size explicitly in the TransferUtilityUploadRequest as 16 MB + 8 KB, the size observed in the log. After uploading, check the "Uploaded part" debug line to confirm the size that was actually used, and split the downloaded file with that same size when recomputing the ETag.

Here's how you can modify your code to implement this workaround:


using System.IO;
using System.Security.Cryptography;
using Amazon.S3;
using Amazon.S3.Transfer;

// Create an Aes object for encryption
Aes aesAlg = Aes.Create();
aesAlg.GenerateKey();
aesAlg.GenerateIV();

// Create an ICryptoTransform object for encryption
ICryptoTransform encryptor = aesAlg.CreateEncryptor(aesAlg.Key, aesAlg.IV);

// Create a CryptoStream for encryption
CryptoStream cryptoStream = new CryptoStream(input, encryptor, CryptoStreamMode.Read);

// Create an instance of AmazonS3Client
IAmazonS3 s3Client = new AmazonS3Client();

// Create a TransferUtility object with the AmazonS3Client
using var transferUtility = new TransferUtility(s3Client);

// Create a TransferUtilityUploadRequest object
var transferRequest = new TransferUtilityUploadRequest()
{
    BucketName = _bucket,
    Key = key,
    InputStream = cryptoStream,
    ContentType = "application/x-binary",
    CannedACL = S3CannedACL.Private,
    PartSize = (16 * 1024 * 1024) + (8 * 1024), // Set part size to 16MB + 8KB
};

// Upload the encrypted stream using TransferUtility
await transferUtility.UploadAsync(transferRequest, cancellationToken);
Expert
Answered 2 months ago