Having an issue with multi-part download from AmazonS3Client


I'm trying to download a file in multiple parts, but when I combine the parts I get a different SHA-256 checksum.

Original file uploaded as a single part:

  • File Name: TEST_30_5MB
  • Byte Size: 31981568
  • sha256 Checksum: f3qUbDuXdch/PLhUDEK5/OZNhTG4wmzOru1PauhPk88=

Combined file after download:

  • File Name: MergedParts
  • Byte Size: 31981568
  • sha256 Checksum: 42GXyCQHFNahxIsrCgq2bR3+txwv/DymWipAunrO7S0=

I download it in 10MB chunks. Here is the code I used to download the files:

var fileId = "89155c11-ac31-478d-b0f0-a5185e06f515";

var ranges = new Dictionary<string, ByteRange>
{
    { @"C:\Projects\Test Files\TestPart1", new ByteRange(1, 10485760) },
    { @"C:\Projects\Test Files\TestPart2", new ByteRange(10485761, 20971520) },
    { @"C:\Projects\Test Files\TestPart3", new ByteRange(20971521, 31457280) },
    { @"C:\Projects\Test Files\TestPart4", new ByteRange(31457281, 41943040) },
};

foreach (var range in ranges)
{
    var request = new GetObjectRequest
    {
        BucketName = _s3settings.BucketName,
        Key = fileId,
        ChecksumMode = ChecksumMode.ENABLED,
        ByteRange = range.Value
    };

    var result = await _client.GetObjectAsync(request, token).ConfigureAwait(false);

    using (var fs = File.OpenWrite(range.Key))
    {
        await result.ResponseStream.CopyToAsync(fs);
    }
}

Then I merged them in CMD with this: copy TestPart1+TestPart2+TestPart3+TestPart4 MergedParts

I also tried starting at 0, like this:

ByteRange(0, 10485759)
ByteRange(10485760, 20971519)
ByteRange(20971520, 31457279)
ByteRange(31457280, 41943039)

But that didn't help. Am I calculating the ranges wrong? Am I missing something else?

asked a year ago (578 views)
2 Answers
Accepted Answer

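The issue you're facing is likely due to the way you're calculating the byte ranges. S3 byte ranges are zero-based and inclusive on both ends, so the ranges in your code, which start at byte 1, skip the first byte of the object (offset 0), and the merged file cannot match the original. The last range also runs past the end of the 31981568-byte file.

Here's a better way to calculate the ranges so that they start at 0 and stop at the last byte of the file: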

int partSize = 10485760; // 10MB
long fileSize = 31981568; // Size of the original file

List<ByteRange> ranges = new List<ByteRange>();
for (long start = 0; start < fileSize; start += partSize)
{
    long end = Math.Min(start + partSize - 1, fileSize - 1);
    ranges.Add(new ByteRange(start, end));
}

This code generates contiguous, non-overlapping byte ranges that start at offset 0, so each byte of the file is downloaded exactly once. The ranges will look like this:

ByteRange(0, 10485759)
ByteRange(10485760, 20971519)
ByteRange(20971520, 31457279)
ByteRange(31457280, 31981567)

Note that the last range ends at 31981567, not 31981568, because offsets are zero-based and byte ranges are inclusive on both ends: the last valid offset is fileSize - 1.
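As a quick sanity check on the arithmetic above, here is a minimal sketch that plans the ranges and adds up how many bytes they cover. The class and method names (RangePlanner, Plan, TotalBytes) are made up for illustration and are not part of the AWS SDK; the tuples would still have to be turned into ByteRange objects for the actual request.

```csharp
using System;
using System.Collections.Generic;

public static class RangePlanner
{
    // Splits a file of fileSize bytes into zero-based, inclusive (Start, End) pairs.
    public static List<(long Start, long End)> Plan(long fileSize, long partSize)
    {
        if (fileSize < 0) throw new ArgumentOutOfRangeException(nameof(fileSize));
        if (partSize <= 0) throw new ArgumentOutOfRangeException(nameof(partSize));

        var ranges = new List<(long Start, long End)>();
        for (long start = 0; start < fileSize; start += partSize)
        {
            // The last part is clamped to the final valid offset, fileSize - 1.
            long end = Math.Min(start + partSize - 1, fileSize - 1);
            ranges.Add((start, end));
        }
        return ranges;
    }

    // An inclusive range holds End - Start + 1 bytes.
    public static long TotalBytes(IEnumerable<(long Start, long End)> ranges)
    {
        long total = 0;
        foreach (var (start, end) in ranges)
        {
            total += end - start + 1;
        }
        return total;
    }
}
```

For the file in the question, RangePlanner.Plan(31981568, 10485760) should give the four ranges listed above, and TotalBytes over them should equal the original size of 31981568.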

After downloading the parts using these ranges, you should be able to merge them without any corruption or checksum mismatch.
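If the end goal is a single file, one option is to skip the intermediate part files and write each range straight into one output stream. The sketch below assumes a hypothetical fetchRange delegate that returns a stream for the inclusive offsets it is given; with the AWS SDK it could wrap GetObjectAsync with ByteRange = new ByteRange(start, end) and return ResponseStream. RangedDownload and CopyInPartsAsync are illustrative names, not SDK APIs.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class RangedDownload
{
    // Fetches [0, fileSize) in consecutive inclusive ranges and appends each
    // part to destination. Returns the number of bytes actually written.
    // fetchRange is a hypothetical delegate supplied by the caller.
    public static async Task<long> CopyInPartsAsync(
        Func<long, long, CancellationToken, Task<Stream>> fetchRange,
        long fileSize,
        long partSize,
        Stream destination,
        CancellationToken token = default)
    {
        if (partSize <= 0) throw new ArgumentOutOfRangeException(nameof(partSize));

        var buffer = new byte[81920];
        long written = 0;

        for (long start = 0; start < fileSize; start += partSize)
        {
            long end = Math.Min(start + partSize - 1, fileSize - 1);

            // Dispose each part stream before requesting the next one.
            using (Stream part = await fetchRange(start, end, token).ConfigureAwait(false))
            {
                int read;
                while ((read = await part.ReadAsync(buffer, 0, buffer.Length, token)
                                         .ConfigureAwait(false)) > 0)
                {
                    await destination.WriteAsync(buffer, 0, read, token).ConfigureAwait(false);
                    written += read;
                }
            }
        }

        return written;
    }
}
```

Comparing the returned count with the expected file size is a cheap first check before computing the checksum. Opening the destination with File.Create rather than File.OpenWrite avoids leftover bytes from an earlier, longer file at the same path.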

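Also, make sure that you're merging the parts in the correct order. The copy command in CMD concatenates the files in the order they're listed. When combining binary files, add the /b switch (copy /b TestPart1+TestPart2+TestPart3+TestPart4 MergedParts); without it, copy treats the parts as text and can append an end-of-file character to the result.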

answered a year ago
EXPERT
reviewed a year ago

I see, I have to start at zero instead of one. So for this file the bytes are 0 - 31981567, not 1 - 31981568.

I was also having an issue when merging the files using copy: I was getting a file that was 1 byte larger for some odd reason. I merged the parts using code instead, much the same way they will be merged in the final app, which worked perfectly.
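The merge code itself isn't shown in this thread. As a rough idea of what merging in code and re-checking the checksum might look like, here is a minimal sketch using only the .NET base class library. PartMerger, Merge and Sha256Base64 are hypothetical names, and the Base64 output is assumed to match the format of the checksums quoted in the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;

public static class PartMerger
{
    // Concatenates the part files, in the order given, into outputPath.
    public static void Merge(IEnumerable<string> partPaths, string outputPath)
    {
        // File.Create truncates an existing file, so no stale bytes survive.
        using (var output = File.Create(outputPath))
        {
            foreach (var partPath in partPaths)
            {
                using (var part = File.OpenRead(partPath))
                {
                    // Raw binary copy: no text-mode translation, no extra bytes.
                    part.CopyTo(output);
                }
            }
        }
    }

    // Returns the SHA-256 digest of a file, Base64-encoded.
    public static string Sha256Base64(string path)
    {
        using (var sha = SHA256.Create())
        using (var stream = File.OpenRead(path))
        {
            return Convert.ToBase64String(sha.ComputeHash(stream));
        }
    }
}
```

With the part files from the question, this would be called as PartMerger.Merge(new[] { "TestPart1", "TestPart2", "TestPart3", "TestPart4" }, "MergedParts"), followed by comparing PartMerger.Sha256Base64("MergedParts") with the checksum of the original object.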

Thanks for your help.

answered a year ago
