The CLI is multi-threaded and will re-use TCP connections (which I think is what you are asking about).
There are a number of tunables if you want to speed up uploads and use all available bandwidth. The most effective is to tune the CLI to use as many threads as possible, along with multi-part uploads (if your individual files are large enough).
Internally the CLI will create a queue and dispatch items from that queue to the sessions it has open, up to whatever you have specified for max_concurrent_requests. Refer to the documentation on setting the CLI for maximum performance in your particular environment: https://docs.aws.amazon.com/cli/latest/topic/s3-config.html
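To make the queue-and-dispatch idea concrete, here is a minimal sketch of that general pattern in Python. It is not the CLI's actual implementation: `run_transfers`, `max_workers`, and the `time.sleep` stand-in for an upload are all hypothetical names chosen for illustration.

```python
import queue
import threading
import time


def run_transfers(tasks, max_workers=4):
    """Drain a queue of tasks with a fixed number of worker threads.

    Returns the finished tasks and the highest number of tasks that
    were in flight at the same moment.
    """
    work = queue.Queue()
    for task in tasks:
        work.put(task)

    finished = []
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def worker():
        while True:
            try:
                task = work.get_nowait()
            except queue.Empty:
                return  # queue drained, this worker is done
            with lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            time.sleep(0.01)  # stand-in for one upload request (assumption)
            with lock:
                state["active"] -= 1
                finished.append(task)

    threads = [threading.Thread(target=worker) for _ in range(max_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return finished, state["peak"]


done, peak = run_transfers([f"file-{i}" for i in range(20)], max_workers=4)
print(len(done), "transfers finished, peak concurrency", peak)
```

The point of the sketch is that the number of workers, not the number of files, caps how many requests are in flight at once, which is why raising the concurrency setting is the first thing to try.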
Please also note that many operating systems place a limit on the number of open file descriptors for processes run by a given user. The CLI cannot open more connections than the limit enforced by the operating system allows. As an example, I am currently using macOS, and if I check my user's open file-descriptor limit it is:
% ulimit -n
256
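This is quite low. Note that the value reported by ulimit -n applies to each process started from that shell, and every open file, pipe, and socket in the process counts against it, not just the CLI's connections.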
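If you would rather check the same limit from a script than from the shell, the following is a small sketch using Python's standard `resource` module (Unix-like systems only). The helper names `fd_limits` and `describe` are made up for this example.

```python
import resource


def fd_limits():
    """Return the (soft, hard) open-file-descriptor limits of this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(value):
    """Render a limit, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


soft, hard = fd_limits()
# The soft limit is the value that `ulimit -n` reports in the shell.
print(f"soft limit: {describe(soft)}, hard limit: {describe(hard)}")
```

The soft limit can usually be raised up to the hard limit without extra privileges, for example with `ulimit -n <value>` in the shell before starting the CLI.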
Amazon S3 only supports HTTP 1.1, which has a well-known limitation: it cannot send multiple HTTP requests over a single TCP connection. When the client opens a TCP connection, it has to send the request and wait for the response. For this reason it is not possible to download multiple files in parallel over a single TCP connection; instead, the client establishes a new connection for each file.
That being said, the AWS CLI S3 transfer commands are multithreaded. At any given time, multiple requests to Amazon S3 are in flight, and TCP connections can also be re-used. Reusing TCP connections for multiple files, and having multiple concurrent connections, is therefore best practice for S3 in general. Please refer to the documentation below to learn more:
Lastly, you can also customize the AWS CLI configuration for Amazon S3 in the AWS config file, which has a default location of ~/.aws/config, to optimize performance. Please refer to the link below:
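As a rough illustration of what those settings look like in the config file, the sketch below renders an `s3` section as text. The key names are the ones listed on the CLI's S3 configuration page; the values are only examples, not recommendations, and `render_s3_config` is a hypothetical helper written for this post.

```python
def render_s3_config(profile="default", **settings):
    """Render an `s3 =` block for ~/.aws/config as a string.

    Nested S3 settings are indented under the `s3 =` line.
    """
    header = "[default]" if profile == "default" else f"[profile {profile}]"
    lines = [header, "s3 ="]
    lines += [f"  {key} = {value}" for key, value in settings.items()]
    return "\n".join(lines) + "\n"


# Example values only (assumption); tune them for your own environment.
print(render_s3_config(
    max_concurrent_requests=20,
    max_queue_size=10000,
    multipart_threshold="64MB",
    multipart_chunksize="16MB",
))
```

The printed text can be merged into the relevant profile in ~/.aws/config by hand; the sketch deliberately does not write to the file so that it cannot clobber existing credentials or settings.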
Guarav, what you say about HTTP 1.1 is not true. By default it allows for connection re-use and for pipelining of requests (which the CLI does not use). If no explicit "Connection: close" header accompanies a request or response on an HTTP 1.1 connection, the connection is left open, ready to accept another HTTP request. Both the client (the CLI) and the server can alter this behavior by sending a "Connection" header. What you describe is the default behavior of the HTTP 1.0 protocol. Please read the "Connection Management" section on page 50 of https://datatracker.ietf.org/doc/html/rfc7230#page-50
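For anyone who wants to see persistent connections in action, here is a small self-contained sketch using only the Python standard library. It starts a throwaway local HTTP/1.1 server (not S3) whose response body is the client's source port, then sends several requests through one `http.client.HTTPConnection`. The names `EchoPortHandler` and `client_ports_seen` are invented for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class EchoPortHandler(BaseHTTPRequestHandler):
    # Speaking HTTP/1.1 makes http.server keep the connection open by default.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        # Reply with the TCP source port the request arrived from.
        body = str(self.client_address[1]).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def client_ports_seen(paths):
    """Send one GET per path over a single connection; return the ports seen."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), EchoPortHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
        ports = []
        for path in paths:
            conn.request("GET", path)
            response = conn.getresponse()
            ports.append(response.read().decode())  # read fully before reuse
        conn.close()
        return ports
    finally:
        server.shutdown()
        server.server_close()


ports = client_ports_seen(["/a", "/b", "/c"])
print("requests:", len(ports), "distinct client ports:", len(set(ports)))
```

If every request had needed its own TCP connection, each one would normally arrive from a different ephemeral port. Seeing a single distinct port for all requests shows that they travelled sequentially over the same connection.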