Why do S3 objects download with gzip ContentEncoding via the CLI?



I can't seem to find an answer to this. When I download objects from an S3 bucket, the response metadata shows a ContentEncoding of gzip, and when I try to open the downloaded PDF file, it says the file is corrupt.

For example:

aws s3api --profile personal get-object --bucket mys3bucket --key "MyInvoice-001.pdf" MyInvoice-001.pdf
{
    "AcceptRanges": "bytes",
    "LastModified": "2023-12-29T06:36:03+00:00",
    "ContentLength": 1123267,
    "ETag": "\"309182placeholdertext4318410\"",
    "ContentEncoding": "gzip",
    "ContentType": "application/pdf",
    "ServerSideEncryption": "AES256",
    "Metadata": {}
}

Since I see the ContentEncoding is set to gzip, I decided to see if I could unzip it...

mv MyInvoice-001.pdf MyInvoice-001.pdf.gz
gunzip MyInvoice-001.pdf.gz

After changing the file extension and unzipping, I can now open the file in Preview.

Is this normal behavior? Does my Python code need to decompress every gzip-encoded file when I process a batch of files, or is there a better way to handle these before processing?

asked 8 months ago · 684 views
1 Answer

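The get-object reference linked below describes the --response-content-encoding option, which overrides the Content-Encoding header returned with the object. Note that, judging from your gunzip test, the object itself appears to be stored gzip-compressed, so overriding the header changes only the reported metadata and not the downloaded bytes; you would still need to decompress them. Hope this helps. https://awscli.amazonaws.com/v2/documentation/api/latest/reference/s3api/get-object.html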
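For the Python side of the question, here is a minimal sketch of one way to handle this, assuming the objects are read with boto3's get_object and that the gzip-compressed ones carry ContentEncoding: gzip as in the output above. The helper names decode_body and fetch_object are made up for illustration; they are not part of boto3. The sketch decompresses only when the header says gzip and the payload actually starts with the gzip magic bytes, so objects that are not compressed pass through unchanged.

```python
import gzip
import io
from typing import Optional

# First two bytes of every gzip stream.
GZIP_MAGIC = b"\x1f\x8b"


def decode_body(body: bytes, content_encoding: Optional[str]) -> bytes:
    """Return the decompressed payload if it is gzip-encoded, else the bytes as-is.

    Hypothetical helper: checks both the ContentEncoding value and the
    gzip magic bytes before decompressing.
    """
    encodings = [e.strip().lower() for e in (content_encoding or "").split(",")]
    if "gzip" in encodings and body[:2] == GZIP_MAGIC:
        return gzip.decompress(body)
    return body


def fetch_object(s3_client, bucket: str, key: str) -> bytes:
    """Download one object and undo gzip ContentEncoding if present.

    s3_client is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    response = s3_client.get_object(Bucket=bucket, Key=key)
    raw = response["Body"].read()
    # ContentEncoding is only present in the response when it was set on the object.
    return decode_body(raw, response.get("ContentEncoding"))


# Example usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   s3 = boto3.client("s3")
#   pdf_bytes = fetch_object(s3, "mys3bucket", "MyInvoice-001.pdf")
#   with open("MyInvoice-001.pdf", "wb") as f:
#       f.write(pdf_bytes)
```

If every object in the bucket is stored this way, it may be simpler to fix the upload process so the PDFs are stored uncompressed, but that depends on how the files were written in the first place.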

answered 8 months ago
reviewed 5 months ago
  • Did you also try changing ContentEncoding to "pdf"?
