Upload Large Amount of Data Using Presigned S3 Upload URL and Object Key


We need to let users who are not employees of our company upload large amounts of data, using temporary access with a preset time limit. Each data set is known up front (total size, filenames, etc.), but it consists of hundreds to thousands of files (images) captured by drones. We need to generate a presigned upload URL that can be used with the AWS CLI and that targets a specific object path / key. For example:

{bucket_name}/path/to/imagery/image_001.jpg
{bucket_name}/path/to/imagery/image_002.jpg
{bucket_name}/path/to/imagery/image_003.jpg
{bucket_name}/path/to/imagery/image_004.jpg
{bucket_name}/path/to/imagery/image_005.jpg
{bucket_name}/path/to/imagery/image_....jpg

I am not entirely sure that this is possible. I know a presigned URL can be generated for one specific file, but as mentioned above that will not work for us. We do have the individual filenames and sizes, but I would prefer to generate a single presigned URL that works with the AWS CLI s3 sync command: it would fix the destination the files are uploaded to and let the user supply their local path.

Is this possible?

1 Answer

A presigned URL can only be used for a single object, so it may not be the best fit here.
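Since the filenames are known up front, one option that stays within that single-object limit is to generate one presigned PUT URL per key. The following is only a minimal sketch, assuming Python with boto3 and valid AWS credentials on the machine that generates the URLs. The function names, the bucket name and the 24-hour expiry are illustrative assumptions. Note that these URLs are meant for a plain HTTP PUT (for example from a small upload script); they cannot be passed to aws s3 sync.

```python
from typing import Dict, Iterable, List


def build_object_keys(prefix: str, filenames: Iterable[str]) -> List[str]:
    """Join the destination prefix and the known filenames into S3 object keys."""
    clean = prefix.strip("/")
    return [f"{clean}/{name}" for name in filenames]


def presign_uploads(bucket: str, keys: Iterable[str], expires_in: int = 86400) -> Dict[str, str]:
    """Return {key: presigned PUT URL}, one URL per object.

    expires_in is in seconds (86400 = 24 hours, an assumed value).
    """
    # Imported here so build_object_keys can be used without boto3 installed.
    import boto3

    s3 = boto3.client("s3")
    return {
        key: s3.generate_presigned_url(
            ClientMethod="put_object",
            Params={"Bucket": bucket, "Key": key},
            ExpiresIn=expires_in,
        )
        for key in keys
    }
```

For example, presign_uploads("example-bucket", build_object_keys("path/to/imagery", ["image_001.jpg", "image_002.jpg"])) would return two URLs, each valid only for its own key. A URL signed with temporary credentials stops working when those credentials expire, even if expires_in is longer.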

An alternative is to have the user create an archive (ZIP or otherwise) containing all the files, upload it to S3 as a single object, and then extract the archive once the upload has completed. For some customers this has a desirable side effect: the upload destination and the final file destination are different, which also allows you to scan the contents for correctness or other attributes before they are used by your systems.

The AWS CLI (using the s3 cp or s3 sync commands) is a valid alternative here, but it requires you to distribute credentials (in some form) to the users. So there is friction, both in installing the CLI and in creating, distributing and maintaining the credentials. If the copy operations will be repeated, it may be a good way to go.

One way to securely distribute temporary credentials is to use IAM Roles Anywhere: your users would hold a certificate (generated and issued by you), and the credentials vended from it grant access to the bucket in question. That access can be revoked when it is no longer needed.
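Whichever way the temporary credentials are delivered, the role behind them can be limited to a single destination prefix so that uploads can only land where you intend. The following is a minimal sketch of such a policy document, built in Python for readability. The function name, bucket name and prefix are illustrative assumptions, and the statements should be reviewed against your own requirements before use.

```python
import json


def prefix_upload_policy(bucket: str, prefix: str) -> dict:
    """Build an IAM policy that allows listing and uploading under one prefix only."""
    clean = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # s3 sync lists the destination to work out what is already there.
                "Sid": "ListOnlyThisPrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{clean}/*"]}},
            },
            {
                # Uploads (including multipart uploads) are allowed only under the prefix.
                "Sid": "UploadOnlyUnderThisPrefix",
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:AbortMultipartUpload"],
                "Resource": f"arn:aws:s3:::{bucket}/{clean}/*",
            },
        ],
    }


print(json.dumps(prefix_upload_policy("example-bucket", "path/to/imagery"), indent=2))
```

With credentials restricted this way, a user could run something like aws s3 sync ./local-folder s3://example-bucket/path/to/imagery/ and, if the transfer is interrupted, run it again; sync compares the source with the destination and should skip files that are already present and unchanged.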

AWS EXPERT, answered a year ago
  • Limiting a reply to 600 characters in this day and age is ridiculous! Anyway, thank you for your reply. Unfortunately, there are some issues here. Currently, operators upload images via a custom-built web app that has drawbacks, the biggest being that a failed upload cannot be restarted. Operators fly fields in the middle of nowhere with bad cell service and Internet access, and data sets must be uploaded within 24 hours. With the AWS CLI s3 sync command, an operator could restart a failed upload and send only the files that had not yet been uploaded.

  • We also cannot rely on the users to create a ZIP file prior to upload, because of the size of the imagery, the local storage required and the time it would take to create the ZIP file. In addition, uploading without being able to pin the image sets to a specific path / key and control access would not be safe, because the operators tend not to pay much attention to detail. We have to be able to control both the access to and the destination of all images in the data sets.
