Limiting the allowed upload data rate


We want to set an upper limit on the allowed data rate when uploading a file to an S3 bucket from an Ubuntu PC. Is there a way of doing this directly? If not, is there a recommended way of doing this using some third-party application? Note: it is the data rate used to upload a file to AWS S3 we want to limit, not the total data rate of the network interface.

jehake
asked 5 years ago · 791 views
4 Answers

Hi,
Here is one way to do it on Ubuntu.

sudo pip install s3cmd
sudo apt install pv

Then configure s3cmd credentials

s3cmd --configure

Enter new values or accept defaults in brackets with Enter.
Refer to user manual for detailed description of all options.

Access key and Secret key are your identifiers for Amazon S3. Leave them empty for using the env variables.
Access Key: xxxxxxxxxxxxxxxxxxxxxxxx
Secret Key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Default Region [US]: us-east-1

Use "s3.amazonaws.com" for S3 Endpoint and not modify it to the target Amazon S3.
S3 Endpoint [s3.amazonaws.com]:

Use "%(bucket)s.s3.amazonaws.com" to the target Amazon S3. "%(bucket)s" and "%(location)s" vars can be used
if the target S3 system supports dns based buckets.
DNS-style bucket+hostname:port template for accessing a bucket [%(bucket)s.s3.amazonaws.com]:

Encryption password is used to protect your files from reading
by unauthorized persons while in transfer to S3
Encryption password: xxxxxxxxxx
Path to GPG program [/usr/bin/gpg]:

When using secure HTTPS protocol all communication with Amazon S3
servers is protected from 3rd party eavesdropping. This method is
slower than plain HTTP, and can only be proxied with Python 2.7 or newer
Use HTTPS protocol [Yes]:

On some networks all internet access must go through a HTTP proxy.
Try setting it here if you can't connect to S3 directly
HTTP Proxy server name:

New settings:
  Access Key: xxxxxxxxxxxxxxxxxxxxxxxxxxx
  Secret Key: xxxxxxxxxxxxxxxxxxxxxxxxxx
  Default Region: us-east-1
  S3 Endpoint: s3.amazonaws.com
  DNS-style bucket+hostname:port template for accessing a bucket: %(bucket)s.s3.amazonaws.com
  Encryption password: xxxxxxxx
  Path to GPG program: /usr/bin/gpg
  Use HTTPS protocol: True
  HTTP Proxy server name:
  HTTP Proxy server port: 0

Test access with supplied credentials? [Y/n] Y
Please wait, attempting to list all buckets...
Success. Your access key and secret key worked fine :-)

Now verifying that encryption works...
Success. Encryption and decryption worked fine :-)

Save settings? [y/N] Y
Configuration saved to '/home/rtakeshi/.s3cfg'

Then upload the file to your bucket with rate limiting (here pv caps the pipe at 100 KB/s, and s3cmd reads the file from stdin):

 cat doggie.jpg | pv -L 100k -q | s3cmd put - s3://kukkichan-xxxxx/doggie.jpg
upload: '<stdin>' -> 's3://kukkichan-xxxxx/doggie.jpg'  [part 1, 2MB]
 2909278 of 2909278   100% in    3s   891.90 kB/s  done

Use "man pv" to get all the various options from Rate Limiting.
Hope this helps!
-randy

answered 5 years ago

Thanks for the answer. It works fine! Any reason why s3cmd is used instead of AWS CLI?

Edited by: jehake on Aug 26, 2019 6:06 AM

jehake
answered 5 years ago

Hi, good thing you just asked me that question :-) I dug a little deeper and found that this is now implemented natively in the AWS CLI for S3. I also verified that it works using:

aws configure set default.s3.max_bandwidth 100KB/s
aws s3 cp doggie.jpg s3://kukkichan-xxxxx/doggie.jpg

Link: https://docs.aws.amazon.com/cli/latest/topic/s3-config.html#configuration-values

max_bandwidth
Default - None
This controls the maximum bandwidth that the S3 commands will utilize when streaming content data to and from S3. Thus, this value only applies for uploads and downloads. It does not apply to copies nor deletes because those data transfers take place server side. The value is in terms of bytes per second. The value can be specified as:
An integer. For example, 1048576 would set the maximum bandwidth usage to 1 MB per second.
A rate suffix. You can specify rate suffixes using: KB/s, MB/s, GB/s, etc. For example: 300KB/s, 10MB/s.
In general, it is recommended to first use max_concurrent_requests to lower transfers to the desired bandwidth consumption. The max_bandwidth setting should then be used to further limit bandwidth consumption if setting max_concurrent_requests is unable to lower bandwidth consumption to the desired rate. This is recommended because max_concurrent_requests controls how many threads are currently running. So if a high max_concurrent_requests value is set and a low max_bandwidth value is set, it may result in threads having to wait unnecessarily which can lead to excess resource consumption and connection timeouts.
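If you prefer not to use `aws configure set`, the same limits can be set persistently in the CLI config file (assuming the default path `~/.aws/config`); this fragment applies both recommendations from the docs above, with example values:

```
[default]
s3 =
  max_concurrent_requests = 2
  max_bandwidth = 300KB/s
```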

-randy

answered 5 years ago

Thanks for the update!

The only thing missing now is being able to limit the data rate in boto3, but that does not seem to be possible yet: https://github.com/boto/boto3/issues/1430
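Until boto3 gains a native setting, one workaround is to throttle the stream boto3 reads from. This is a minimal sketch, not a boto3 feature: `upload_fileobj()` accepts any object exposing `read()`, so a hypothetical `RateLimitedReader` wrapper that sleeps between reads caps the average upload rate.

```python
import io
import time


class RateLimitedReader:
    """File-like wrapper that caps the average read rate at max_bps bytes/s.

    Hypothetical helper, not part of boto3. Throttling read() throttles
    whatever consumes the stream, including an S3 upload.
    """

    def __init__(self, fileobj, max_bps):
        self._fileobj = fileobj
        self._max_bps = max_bps
        self._start = time.monotonic()
        self._sent = 0

    def read(self, size=-1):
        chunk = self._fileobj.read(size)
        self._sent += len(chunk)
        # Sleep until the elapsed time matches how long this many bytes
        # should have taken at max_bps.
        target = self._sent / self._max_bps
        elapsed = time.monotonic() - self._start
        if target > elapsed:
            time.sleep(target - elapsed)
        return chunk


# Usage with boto3 (requires boto3 installed and credentials configured):
# import boto3
# s3 = boto3.client("s3")
# with open("doggie.jpg", "rb") as f:
#     s3.upload_fileobj(RateLimitedReader(f, 100 * 1024),
#                       "kukkichan-xxxxx", "doggie.jpg")
```

Note that for large files `upload_fileobj` may switch to multipart uploads; since the wrapper only throttles sequential reads, treat this as a starting point and verify the achieved rate against your own files.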

jehake
answered 5 years ago
