Image files downloaded from S3 using PowerShell Read-S3Object are garbage

0

I am writing a PowerShell script to download image files from S3 storage to an EC2 instance. The script runs as expected, downloading the latest version of each file using Read-S3Object -BucketName $bucketName -Key $object.Key -File $localPath. All images do download; however, the images seem to be corrupted. I can see that the file size of the images downloaded with the script is about 1 KB bigger than that of the ones I download manually (which I assume accounts for the "corruption").

I can go into S3 storage using a browser or Cloudberry Explorer and download the files manually and all are fine.
Any thoughts anyone?
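A quick way to rule out transfer corruption is to compare the local file's MD5 hash with the object's S3 ETag (for single-part uploads, the ETag is the MD5 of the object body). The thread's script is PowerShell, but as a minimal, language-agnostic sketch in Python (the path is hypothetical):

```python
import hashlib

def file_md5(path):
    """Return the hex MD5 of a file, read in binary mode."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

# Compare this against the object's ETag (with the surrounding
# quotes stripped); a mismatch means the bytes on disk differ
# from the bytes stored in S3.
```

Note that ETags of multipart uploads are not plain MD5 digests, so this check only applies to objects uploaded in a single part.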

  • Can you share the code snippet here? I have seen this happen before, so I want to understand how you are downloading it.

JPH-FRS
Asked 9 months ago · Viewed 417 times
2 Answers
0

Check that your $localPath is valid and that the directory structure exists.

Another reason could be an encoding/decoding issue; one of the most common causes is writing binary data (like an image) in text mode.

# Note: Read-S3Object has no -FileMode parameter; it always writes the raw object bytes:
Read-S3Object -BucketName $bucketName -Key $object.Key -File $localPath

You can also refer to this link: https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-content-encodings-examples-image-s3.html
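The size growth the questioner reports (each file about 1 KB bigger) is consistent with binary data passing through a text decode/encode step somewhere in the pipeline. A minimal Python sketch of the effect — not the thread's actual code, just an illustration of why a text-mode round trip inflates a binary file:

```python
# Simulate binary data passing through a text decode/encode step:
# bytes that are not valid UTF-8 are replaced with U+FFFD, which
# re-encodes to 3 bytes each, so the round-tripped copy grows.
original = bytes(range(256)) * 4           # 1024 bytes of raw binary data
text = original.decode("utf-8", errors="replace")
round_tripped = text.encode("utf-8")
print(len(original), len(round_tripped))   # the copy is larger than the original
```

The ASCII portion survives intact, which is why such a file can still look superficially similar in a diff tool while no longer opening as an image.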

Answered 9 months ago
0

Thanks for taking the time to answer, Palvinder. I was expecting it to be something like that: some setting I was unaware of that affected the encoding of a binary file.

Looking deeper, it seems that using anything other than Cloudberry Explorer to extract the file, even the standard S3 web interface, yields a "corrupted" image file. I thought I had ruled out web-browser extraction as having the same issue, but apparently I was mistaken.

I used VBinDiff to examine the differences between the files, and it appears that the Cloudberry Backup utility has put a "wrapper" around the original file, and that wrapper is only removed when a Cloudberry product extracts the file from S3. This could be due to "versioning" that is done by Cloudberry rather than by the bucket itself.
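One way to confirm such a wrapper without a byte-diff tool is to check whether a downloaded file still begins with the JPEG magic bytes (FF D8 FF), since every JPEG starts with that Start-of-Image marker. A minimal sketch in Python (paths and the helper name are hypothetical):

```python
JPEG_MAGIC = b"\xff\xd8\xff"  # Start-of-Image marker at offset 0 of every JPEG

def looks_like_jpeg(path):
    """Return True if the file begins with the JPEG SOI marker."""
    with open(path, "rb") as f:
        return f.read(3) == JPEG_MAGIC

# A file with a backup tool's header prepended will fail this check,
# because the wrapper bytes sit in front of the real image data.
```

The same idea applies to other formats (PNG files begin with 89 50 4E 47, for example), so a wrapped download is detectable regardless of image type.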

I'll probably have to use CB Explorer to extract the entire folder structure, and then use a modified version of the script to extract what is needed.

Again, thanks for your time looking at this.

JPH-FRS
Answered 9 months ago
