I store various files in an S3 bucket and would like to compress them, some with Gzip and some with Brotli. For the Gzip case I set `Content-Encoding` to `gzip`, and for the Brotli case I set it to `br`. The files have the corresponding suffixes, i.e. `.gz` for Gzip-compressed files and `.br` for Brotli-compressed files.

The problem is that when I download the files using the Amazon S3 console, both types of files are correctly decompressed, but only the Gzip-compressed files have their suffix removed. E.g. when I download `file1.json.gz` (which has `Content-Type` set to `application/json` and `Content-Encoding` set to `gzip`), it gets decompressed and saved as `file1.json`. However, when I download `file2.json.br` (with `Content-Type` set to `application/json` and `Content-Encoding` set to `br`), the file gets decompressed but another `.json` suffix is added, so the file is saved as `file2.json.json`. I also tried setting `Content-Disposition` to `attachment; filename="file2.json"`, but this doesn't help.

So, I have a couple of questions:
- What is the correct way to store compressed files in S3 so that they are handled consistently? According to the `PutObject` API, `Content-Encoding` is what specifies that a file has been compressed with a particular algorithm and needs to be decompressed when accessed by the client, so it seems that the file extension (e.g. `.br`) is not needed. However, some services, e.g. Athena, explicitly state that files need the proper extension to be treated as compressed.
- Are Gzip-compressed files handled differently from other types (e.g. Brotli)? If so, why, and is it the browser or S3 that initiates this different handling?
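
For reference, the setup described above can be reproduced with something like the following. This is only a minimal sketch, not my exact upload code: the helper names (`compress`, `build_put_params`, `upload`) and the bucket name are hypothetical, and it assumes `boto3` and the third-party `brotli` package are installed. The `keep_suffix` flag is there to compare uploading with and without the `.gz` / `.br` suffix in the object key.

```python
import gzip

# Content-Encoding value -> key suffix, as used in the question.
SUFFIXES = {"gzip": ".gz", "br": ".br"}


def compress(data: bytes, encoding: str) -> bytes:
    """Compress data according to the given Content-Encoding value."""
    if encoding == "gzip":
        return gzip.compress(data)
    if encoding == "br":
        import brotli  # third-party package, assumed to be installed

        return brotli.compress(data)
    raise ValueError(f"unsupported encoding: {encoding}")


def build_put_params(bucket, name, data, encoding,
                     content_type="application/json", keep_suffix=True):
    """Build the keyword arguments for an S3 PutObject call (hypothetical helper)."""
    key = name + SUFFIXES[encoding] if keep_suffix else name
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": compress(data, encoding),
        "ContentType": content_type,   # type of the *decompressed* payload
        "ContentEncoding": encoding,   # "gzip" or "br"
    }


def upload(params):
    """Send the object to S3; requires boto3 and valid AWS credentials."""
    import boto3

    boto3.client("s3").put_object(**params)


# Example (bucket name is a placeholder):
# upload(build_put_params("my-bucket", "file2.json", b'{"a": 1}', "br"))
```

With `keep_suffix=True` this yields keys like `file1.json.gz` and `file2.json.br`, which is the case where I see the inconsistent file names on download.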