Questions tagged with Amazon Simple Storage Service

S3 – file extension and metadata for compressed files

I store various files in an S3 bucket that I'd like to compress, some with Gzip and some with Brotli. For the Gzip case I set `Content-Encoding` to `gzip`, and for the Brotli case I set it to `br`. The files have the corresponding suffixes, i.e. `.gz` for Gzip-compressed files and `.br` for Brotli-compressed files.

The problem is that when I download the files using the Amazon S3 console, both types are correctly decompressed, but only the Gzip-compressed files have their suffix removed. For example, when I download `file1.json.gz` (which has `Content-Type` set to `application/json` and `Content-Encoding` set to `gzip`), it gets decompressed and saved as `file1.json`. However, when I download `file2.json.br` (with `Content-Type` set to `application/json` and `Content-Encoding` set to `br`), the file gets decompressed but another `.json` suffix is added, so it is saved as `file2.json.json`. I also tried setting `Content-Disposition` to `attachment; filename="file2.json"`, but this doesn't help.

So I have a couple of questions:

- What's the correct way to store compressed files in S3 to get consistent handling? According to the [`PutObject`](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObject.html#API_PutObject_RequestSyntax) API, `Content-Encoding` is what indicates that a file has been compressed with a particular algorithm and needs to be decompressed by the client, so the file extension (e.g. `.br`) seems unnecessary. However, some services, e.g. [Athena](https://docs.aws.amazon.com/athena/latest/ug/compression-formats.html), explicitly state that files need the proper extension to be treated as compressed.
- Are Gzip-compressed files handled differently from other types (e.g. Brotli)? If so, why, and is it the browser or S3 that initiates this different handling?
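For reference, the objects are written roughly like this (a minimal sketch using the AWS SDK for JavaScript v3; the bucket name, key, and file names are placeholders, not the actual values):

```typescript
// Minimal sketch: uploading a Brotli-compressed JSON file with compression metadata.
import { readFileSync } from "node:fs";
import { brotliCompressSync } from "node:zlib";
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";

const s3 = new S3Client({ region: "us-east-1" });

// Compress the original JSON before uploading.
const body = brotliCompressSync(readFileSync("file2.json"));

await s3.send(
  new PutObjectCommand({
    Bucket: "example-bucket",        // placeholder
    Key: "data/file2.json.br",       // suffix kept for services (e.g. Athena) that rely on it
    Body: body,
    ContentType: "application/json", // type of the decompressed content
    ContentEncoding: "br",           // tells clients the payload is Brotli-compressed
    ContentDisposition: 'attachment; filename="file2.json"',
  })
);
```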
0 answers · 0 votes · 13 views · asked 14 days ago

Multipart upload with AWS S3 + checksums

I am trying to implement browser multipart upload to an S3 bucket. I need to be able to pause and resume the upload, and I'd also like to generate checksums automatically while uploading. I have tried several approaches and keep hitting a wall:

* Using the Amplify S3 upload. This works well, but has the caveat that I can't generate the checksums automatically; to get them, I run a Lambda function after the file upload, and for large files that function times out. I'd also like to avoid this route because I believe it's quite computationally expensive.
* Following https://blog.logrocket.com/multipart-uploads-s3-node-js-react/. This is similar to the above; the caveat is that when I add the checksum algorithm to the upload-part request, I get **checksum type mismatch occurred, expected checksum type sha256, actual checksum type: null**. After a lot of googling, I'm not sure I can compute the checksums using a presigned URL.
* My current approach is to do away with the presigned URLs and send the chunked data to a Lambda function, which then writes it to the bucket. Since I'm managing everything with Amplify, I run into problems with API Gateway (multipart/form-data). I have set the gateway to accept binary data and followed other fixes I found online, but I'm stuck on **execution failed due to configuration error: unable to transform request**.

How do I fix the above error, and what would be the ideal approach to implement this functionality (multipart upload that supports resumable uploads and checksum computation)?
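For context, the per-part checksum mechanics look roughly like this when the parts are uploaded directly with the AWS SDK for JavaScript v3 (a minimal sketch; the bucket, key, and part buffers are placeholders). Once the multipart upload is created with `ChecksumAlgorithm: "SHA256"`, every `UploadPart` request has to carry a matching per-part `ChecksumSHA256`, which appears to be what the "actual checksum type: null" error refers to.

```typescript
// Minimal sketch: multipart upload where every part carries its own SHA-256 checksum.
import { createHash } from "node:crypto";
import {
  S3Client,
  CreateMultipartUploadCommand,
  UploadPartCommand,
  CompleteMultipartUploadCommand,
} from "@aws-sdk/client-s3";

const s3 = new S3Client({ region: "us-east-1" });
const Bucket = "example-bucket";        // placeholder
const Key = "uploads/large-file.bin";   // placeholder

async function uploadWithChecksums(parts: Buffer[]) {
  // Declare the checksum algorithm up front; S3 then expects every
  // UploadPart request to include a matching x-amz-checksum-sha256 value.
  const { UploadId } = await s3.send(
    new CreateMultipartUploadCommand({ Bucket, Key, ChecksumAlgorithm: "SHA256" })
  );

  const completed: { PartNumber: number; ETag?: string; ChecksumSHA256: string }[] = [];

  // Every part except the last must be at least 5 MiB.
  for (let i = 0; i < parts.length; i++) {
    const body = parts[i];
    // Base64-encoded SHA-256 of this part; omitting it is what triggers
    // "checksum type mismatch ... actual checksum type: null".
    const ChecksumSHA256 = createHash("sha256").update(body).digest("base64");

    const res = await s3.send(
      new UploadPartCommand({
        Bucket,
        Key,
        UploadId,
        PartNumber: i + 1,
        Body: body,
        ChecksumSHA256,
      })
    );
    completed.push({ PartNumber: i + 1, ETag: res.ETag, ChecksumSHA256 });
  }

  return s3.send(
    new CompleteMultipartUploadCommand({
      Bucket,
      Key,
      UploadId,
      MultipartUpload: { Parts: completed },
    })
  );
}
```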
0 answers · 0 votes · 23 views · asked 20 days ago

Mounting a file system to GitHub Actions

I am attempting to shift a workflow into the cloud. To keep costs down, I am using GitHub Actions for the Mac-specific part: building macOS install packages with a tool called AutoPkg. AutoPkg caches the application download and the built package between runs. Unfortunately this cache is too large for GitHub and can include files too big for GitHub Actions. Package building has to happen on a Mac.

The next step, uploading the packages to multiple sites and running some Python to process the built packages, can run on a small Linux EC2 instance. So the logical solution seems to be to provide a file system from AWS that AutoPkg can use as a cache and mount it on every GitHub Actions run.

I have been tearing my hair out attempting this with either S3 and s3fs, or EFS, and can't seem to wrap my head around how all the bits hang together. For testing I tried the mount natively on my Mac and in amazonlinux and Debian Docker containers. I figure the solution will be using NFS or efs-utils to mount an EFS volume, but I can't get it working. In a Debian container using efs-utils I got close, but it seems I can't get the DNS name to resolve. The amazonlinux Docker container was too basic to get efs-utils working. I also got the AWS CLI installed, but it runs into the same DNS resolution problems. I tried connecting the underlying Mac to an AWS VPN in the same VPC as the file system, but still had the same DNS problems.

Any help would be appreciated. I've just updated the question with some more stuff I have tried.
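One detail worth illustrating: the `fs-xxxx.efs.<region>.amazonaws.com` name only resolves through the VPC's own DNS, so from outside the VPC a common workaround is to look up the mount target's IP address via the EFS API and mount by IP, assuming there is a network path into the VPC (e.g. over the VPN). A minimal sketch with the AWS SDK for JavaScript v3; the file system ID and mount path are placeholders:

```typescript
// Minimal sketch: list EFS mount targets and print NFS mount commands that
// use the mount target IP instead of the VPC-only DNS name.
import { EFSClient, DescribeMountTargetsCommand } from "@aws-sdk/client-efs";

const efs = new EFSClient({ region: "us-east-1" });

const { MountTargets } = await efs.send(
  new DescribeMountTargetsCommand({ FileSystemId: "fs-0123456789abcdef0" }) // placeholder ID
);

for (const mt of MountTargets ?? []) {
  // Mounting by IP sidesteps DNS resolution of fs-xxxx.efs.<region>.amazonaws.com,
  // but still requires network reachability into the mount target's subnet.
  console.log(
    `sudo mount -t nfs4 -o nfsvers=4.1 ${mt.IpAddress}:/ /mnt/autopkg-cache`
  );
}
```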
0 answers · 0 votes · 10 views · asked 20 days ago