I'm running the sync command to transfer data between my EC2 instance and my S3 bucket, but the transfer is slow. How can I troubleshoot this?

8 minute read
0

I'm running the sync command to transfer data between my Amazon Elastic Compute Cloud (Amazon EC2) instance and my Amazon Simple Storage Service (Amazon S3) bucket. However, the transfer is slow. How can I troubleshoot this?

Short description

The sync command on the AWS Command Line Interface (AWS CLI) is a high-level command that includes the ListObjectsV2, HeadObject, GetObject, and PutObject API calls. To identify what might be contributing to the slow transfer:

  • Review the architecture of your use case.
  • Check the network connectivity.
  • Test the speed of uploading to and downloading from Amazon S3.
  • Review the network and resource load while sync runs as a background process.

Resolution

Review the architecture of your use case

Before you test the network connectivity, transfer speeds, and resource loads, consider the following architecture factors that can influence transfer speed:

  • Which Amazon EC2 instance type are you using? For this transfer use case, it's a best practice to use an instance that has a minimum of 10 Gbps throughput.
  • Are the EC2 instance and the S3 bucket in the same AWS Region? It's a best practice to deploy the instance and the bucket in the same Region. It's also a best practice to attach a VPC endpoint for Amazon S3 to the VPC where your instance is deployed.
  • For instances and buckets that are in the same Region, is the AWS CLI configured to use the Amazon S3 Transfer Acceleration endpoint? It's a best practice to not use the Transfer Acceleration endpoint if the resources are in the same Region.
  • What's the nature of the source data set that you want to transfer? For example, are you transferring a lot of small files or a few large files to Amazon S3? For more information about using the AWS CLI to transfer different source data sets to Amazon S3, see Getting the most out of the Amazon S3 CLI.
  • What version of the AWS CLI are you using? Make sure that you’re using the most recent version of the AWS CLI.
  • What's your configuration of the AWS CLI?

If you're still experiencing slow transfers after following best practices, then check the network connectivity, transfer speeds, and resource loads.

Check the network connectivity

Run the dig command on the S3 bucket and review the query response time returned in the Query time field. In the following example, the Query time is 0 msec:

Bash

$ dig +nocomments +stats +nocmd awsexamplebucket.s3.amazonaws.com

;awsexamplebucket.s3.amazonaws.com. IN	A
awsexamplebucket.s3.amazonaws.com. 2400 IN CNAME	s3-3-w.amazonaws.com.
s3-3-w.amazonaws.com.	2	IN	A	52.218.24.66
;; Query time: 0 msec
;; SERVER: 172.31.0.2#53(172.31.0.2)
;; WHEN: Fri Dec 06 09:30:47 UTC 2019
;; MSG SIZE  rcvd: 87

Longer response times for the Domain Name System (DNS) resolution queries to return an IP address can impact performance. If you get a longer query response time, then try changing the DNS servers for your instance. As another network connectivity test, run traceroute or mtr using TCP to the virtual style hostname and the S3 Regional endpoint for your bucket. The request in the following mtr example is routed through a VPC endpoint for Amazon S3 that's attached to the instance's VPC:

Bash

$ mtr -r --tcp --aslookup  --port 443 -c50  awsexamplebucket.s3.eu-west-1.amazonaws.com
Start: 2019-12-06T10:03:30+0000
HOST: ip-172-31-4-38.eu-west-1.co Loss%   Snt   Last   Avg  Best  Wrst StDev
  1. AS???    ???                 100.0    50    0.0   0.0   0.0   0.0   0.0
  2. AS???    ???                 100.0    50    0.0   0.0   0.0   0.0   0.0
  3. AS???    ???                 100.0    50    0.0   0.0   0.0   0.0   0.0
  4. AS???    ???                 100.0    50    0.0   0.0   0.0   0.0   0.0
  5. AS???    ???                 100.0    50    0.0   0.0   0.0   0.0   0.0
  6. AS???    ???                 100.0    50    0.0   0.0   0.0   0.0   0.0
  7. AS16509  s3-eu-west-1-r-w.am 62.0%    50    0.3   0.2   0.2   0.4   0.0

Test the speed of uploading to and downloading from Amazon S3

1.    Create five test files that contain 2 GB of content:

Bash

$ seq -w 1 5 | xargs -n1 -P 5 -I % dd if=/dev/urandom of=bigfile.% bs=1024k count=2048

$ ls -l
total 10244
-rw-rw-r-- 1 ec2-user ec2-user 2097152 Nov 8 08:14 bigfile.1
-rw-rw-r-- 1 ec2-user ec2-user 2097152 Nov 8 08:14 bigfile.2
-rw-rw-r-- 1 ec2-user ec2-user 2097152 Nov 8 08:14 bigfile.3
-rw-rw-r-- 1 ec2-user ec2-user 2097152 Nov 8 08:14 bigfile.4
-rw-rw-r-- 1 ec2-user ec2-user 2097152 Nov 8 08:14 bigfile.5

2.    Run the sync command using the AWS CLI to upload the five test files. To get the transfer time, insert the time command (from Linux documentation) at the beginning of the sync command:

Note: Be sure to also note the throughput speed while the sync command is in progress.

Bash 

$ time aws s3 sync . s3://awsexamplebucket/test_bigfiles/ --region eu-west-1

Completed 8.0 GiB/10.2 GiB (87.8MiB/s) with 3 file(s) remaining

real 2m14.402s
user 2m6.254s
sys 2m22.314s

You can use these test results as a baseline to compare to the time of the actual sync for your use case.

Review the network and resource load while sync runs as a background process

1.    Append & to the end of the sync command to run the command in the background:

Note: You can also append a stream operator (>) to write output to a text file that you can review later.

Bash

$ time aws s3 sync . s3://awsexamplebucket/test_bigfiles/ --region eu-west-1 \
> ~/upload.log &
[1] 4262
$

2.    While the sync command runs in the background, run the mpstat command (from Linux documentation) to check CPU usage. The following example shows that 4 CPUs are being used and they are utilized around 20%:

Bash 

$ mpstat -P ALL 10
Average:     CPU    %usr   %nice    %sys   %iowait   %irq   %soft  %steal  %guest  %gnice  %idle
Average:     all   21.21    0.00   23.12    0.00    0.00    2.91    0.00    0.00    0.00   52.77
Average:       0   21.82    0.00   21.71    0.00    0.00    3.52    0.00    0.00    0.00   52.95
Average:       1   21.32    0.00   23.76    0.00    0.00    2.66    0.00    0.00    0.00   52.26
Average:       2   20.73    0.00   22.76    0.00    0.00    2.64    0.00    0.00    0.00   53.88
Average:       3   21.03    0.00   24.07    0.00    0.00    2.87    0.00    0.00    0.00   52.03

In this case, the CPU isn't the bottleneck. If you see utilization percentages that are equal to or greater than 90%, then try launching an instance that has additional CPUs. You can also run the top command to review the highest CPU utilization percentages that are running. Try to stop those processes first, and then run the sync command again.

3.    While the sync command runs in the background, run the lsof command (from Linux documentation). This checks how many TCP connections are open to Amazon S3 on port 443:

Note: If max_concurrent_requests is set to 20 for the user profile in the AWS CLI config file, then expect to see a maximum of 20 established TCP connections.

Bash

$ lsof -i tcp:443
COMMAND  PID     USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
aws     4311 ec2-user    3u  IPv4  44652      0t0  TCP ip-172-31-4-38.eu-west-1.compute.internal:33156->52.218.36.91:https (CLOSE_WAIT)
aws     4311 ec2-user    4u  IPv4  44654      0t0  TCP ip-172-31-4-38.eu-west-1.compute.internal:39240->52.216.162.179:https (ESTABLISHED)
aws     4311 ec2-user    5u  IPv4  44655      0t0  TCP ip-172-31-4-38.eu-west-1.compute.internal:39242->52.216.162.179:https (ESTABLISHED)
aws     4311 ec2-user    6u  IPv4  47528      0t0  TCP ip-172-31-4-38.eu-west-1.compute.internal:39244->52.216.162.179:https (ESTABLISHED)
aws     4311 ec2-user    7u  IPv4  44656      0t0  TCP ip-172-31-4-38.eu-west-1.compute.internal:39246->52.216.162.179:https (ESTABLISHED)
aws     4311 ec2-user    8u  IPv4  45671      0t0  TCP ip-172-31-4-38.eu-west-1.compute.internal:39248->52.216.162.179:https (ESTABLISHED)
aws     4311 ec2-user   13u  IPv4  46367      0t0  TCP ip-172-31-4-38.eu-west-1.compute.internal:39254->52.216.162.179:https (ESTABLISHED)
aws     4311 ec2-user   14u  IPv4  44657      0t0  TCP ip-172-31-4-38.eu-west-1.compute.internal:39252->52.216.162.179:https (ESTABLISHED)
aws     4311 ec2-user   15u  IPv4  45673      0t0  TCP ip-172-31-4-38.eu-west-1.compute.internal:39250->52.216.162.179:https (ESTABLISHED)
aws     4311 ec2-user   32u  IPv4  47530      0t0  TCP ip-172-31-4-38.eu-west-1.compute.internal:39258->52.216.162.179:https (ESTABLISHED)
aws     4311 ec2-user   33u  IPv4  45676      0t0  TCP ip-172-31-4-38.eu-west-1.compute.internal:39256->52.216.162.179:https (ESTABLISHED)
aws     4311 ec2-user   34u  IPv4  44660      0t0  TCP ip-172-31-4-38.eu-west-1.compute.internal:39266->52.216.162.179:https (ESTABLISHED)
aws     4311 ec2-user   35u  IPv4  45678      0t0  TCP ip-172-31-4-38.eu-west-1.compute.internal:39260->52.216.162.179:https (ESTABLISHED)
aws     4311 ec2-user   36u  IPv4  45679      0t0  TCP ip-172-31-4-38.eu-west-1.compute.internal:39262->52.216.162.179:https (ESTABLISHED)
aws     4311 ec2-user   37u  IPv4  45680      0t0  TCP ip-172-31-4-38.eu-west-1.compute.internal:39268->52.216.162.179:https (ESTABLISHED)
aws     4311 ec2-user   38u  IPv4  45681      0t0  TCP ip-172-31-4-38.eu-west-1.compute.internal:39264->52.216.162.179:https (ESTABLISHED)
aws     4311 ec2-user   39u  IPv4  45683      0t0  TCP ip-172-31-4-38.eu-west-1.compute.internal:39272->52.216.162.179:https (ESTABLISHED)
aws     4311 ec2-user   40u  IPv4  47533      0t0  TCP ip-172-31-4-38.eu-west-1.compute.internal:39270->52.216.162.179:https (ESTABLISHED)
aws     4311 ec2-user   41u  IPv4  44662      0t0  TCP ip-172-31-4-38.eu-west-1.compute.internal:39276->52.216.162.179:https (ESTABLISHED)
aws     4311 ec2-user   42u  IPv4  44661      0t0  TCP ip-172-31-4-38.eu-west-1.compute.internal:39274->52.216.162.179:https (ESTABLISHED)
aws     4311 ec2-user   43u  IPv4  44663      0t0  TCP ip-172-31-4-38.eu-west-1.compute.internal:39278->52.216.162.179:https (ESTABLISHED)

If you see other TCP connections on port 443, then try stopping those connections before running the sync command again.

To get a count of the TCP connections, run this command:

$ lsof -i tcp:443 | tail -n +2 | wc -l
21

4.    After the single sync process is optimized, you can run multiple sync processes in parallel. This avoids single-process slower uploads when high network bandwidth is available, but only half of the network bandwidth is being utilized. When you run parallel sync processes, target different prefixes to get the desired throughput.

For more information, see How can I optimize performance when I upload large amounts of data to Amazon S3?


AWS OFFICIAL
AWS OFFICIALUpdated 2 years ago