What tools can I use with EC2Rescue for Linux to troubleshoot performance bottlenecks within my instances?


I want to troubleshoot performance bottlenecks within my Amazon Elastic Compute Cloud (Amazon EC2) instances that run Amazon Linux 1 or 2, Red Hat Enterprise Linux (RHEL), or CentOS.

Short description

To determine where performance bottlenecks occur, use the bcc (BPF Compiler Collection) framework for eBPF with the tools in EC2Rescue for Linux.

Note: This resolution doesn't apply to instances that run Debian or Ubuntu.

Resolution

Install the bcc package for your OS

Complete the following steps:

  1. Use SSH to connect to your instance.

  2. Run the following command to install the bcc package for Amazon Linux instances:

    $ sudo yum install bcc

    Note: For download and installation instructions for distributions other than Amazon Linux, refer to your distribution's documentation.

  3. Run the following commands to add the bcc tools to the PATH variable:

    $ sudo -s
    # export PATH=$PATH:/usr/share/bcc/tools/

    Note: For EC2Rescue for Linux to run the bcc tools, the tools must be in the PATH variable on your operating system (OS).

  4. As a best practice, permanently add the PATH setting to your Linux system. For Amazon Linux, complete the following steps:
    Run the following command to open ~/.bash_profile in the vi editor:

    # vi ~/.bash_profile

    Add /usr/share/bcc/tools to the PATH variable:

    PATH=$PATH:$HOME/bin:/usr/share/bcc/tools

    Save the file, and then exit the vi editor.
    Run the following command to source the updated profile:

    # source ~/.bash_profile

    Note: The steps to permanently add the PATH setting vary depending on your Linux distribution.

  5. Download and install the EC2Rescue for Linux tool, and then navigate to the installation directory on your instance.
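
Before you run any modules, it can help to confirm that the PATH change from the preceding steps took effect. The following is a minimal sketch, not part of EC2Rescue for Linux: the add_to_path helper is a hypothetical name, and the tool names that it checks (softirqs, runqlat, biolatency, tcptop) are assumed to be installed in /usr/share/bcc/tools by the bcc package.

```shell
#!/bin/sh
# Hypothetical helper: append a directory to PATH only if it is not
# already present, so repeated sourcing does not create duplicates.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already in PATH, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

add_to_path /usr/share/bcc/tools
export PATH

# Report whether a few of the bcc tools can be resolved through PATH.
for tool in softirqs runqlat biolatency tcptop; do
    if command -v "$tool" >/dev/null 2>&1; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done
```

If a tool is reported as missing, check that the bcc package installed correctly and that the tools directory matches the path used by your distribution.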

Use bcc-based modules with EC2Rescue for Linux

Each of the following modules runs for the specified period and collects output a specified number of times. 

CPU performance tools

The bccsoftirqs.yaml module runs the softirqs tool, which traces soft interrupt requests (soft IRQs) and stores timing statistics in the kernel. Use --period to set the interval and --times to set the count. The tool automatically adds a timestamp to each output. For more information, see aws-ec2rescue-linux/mod.d/bccsoftirqs.yaml on the GitHub website.

The bccrunqlat.yaml module shows how long tasks wait before they run on a CPU. The results appear as a histogram. For more information, see aws-ec2rescue-linux/mod.d/bccrunqlat.yaml on the GitHub website.
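
The CPU modules can be started with the same command form that the example later in this article uses for bcctcptop. The following sketch wraps that form in a hypothetical run_bcc_module helper. The DRY_RUN variable is also an assumption of this sketch: by default the helper only prints the command, and it runs the command only when DRY_RUN is set to 0. Run it as root from the EC2Rescue for Linux installation directory.

```shell
#!/bin/sh
# Hypothetical wrapper: build the ec2rl command for one bcc-based module.
# Arguments: module name, period in seconds, number of times.
run_bcc_module() {
    module=$1
    period=$2
    times=$3
    set -- ./ec2rl run "--only-modules=$module" "--period=$period" "--times=$times"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"       # dry run: print the command only
    else
        "$@"            # run the command from the current directory
    fi
}

# Collect soft IRQ statistics and run queue latency, 2 samples of 5 seconds each.
run_bcc_module bccsoftirqs 5 2
run_bcc_module bccrunqlat 5 2
```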

Block I/O performance tools

The bccbiolatency.yaml module traces block device I/O and records the distribution of I/O latency per disk device. Results are shown as a histogram. For more information, see aws-ec2rescue-linux/mod.d/bccbiolatency.yaml on the GitHub website.

The bccext4slower.yaml module uses the ext4slower tool to collect output. The tool traces ext4 reads, writes, opens, and fsyncs that are slower than a threshold of 10 milliseconds. For more information, see aws-ec2rescue-linux/mod.d/bccext4slower.yaml on the GitHub website.

Note: For XFS file systems, use the bccxfsslower.yaml module in the same way as bccext4slower.yaml. For more information, see aws-ec2rescue-linux/mod.d/bccxfsslower.yaml on the GitHub website.

The bccfileslower.yaml module uses fileslower to collect output. The tool traces file-based synchronous reads and writes that are slower than a default threshold of 10 milliseconds. For more information, see aws-ec2rescue-linux/mod.d/bccfileslower.yaml on the GitHub website.
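
Because bccext4slower and bccxfsslower each target one file system type, it can be useful to check the file system before you choose a module. The following is a minimal sketch under stated assumptions: module_for_fstype is a hypothetical helper, the findmnt utility is assumed to be available, and the fallback to bccfileslower for other file system types is a choice made for this illustration rather than guidance from the modules themselves.

```shell
#!/bin/sh
# Hypothetical helper: map a file system type to one of the modules above.
module_for_fstype() {
    case "$1" in
        ext4) echo bccext4slower ;;
        xfs)  echo bccxfsslower ;;
        *)    echo bccfileslower ;;   # assumption: generic fallback
    esac
}

# Look up the file system type of the root volume (assumes findmnt exists).
fstype=$(findmnt -n -o FSTYPE / 2>/dev/null) || fstype=unknown
echo "Root file system: ${fstype:-unknown}"
echo "Suggested module: $(module_for_fstype "$fstype")"
```

To check a data volume instead of the root volume, replace / with the mount point of that volume.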

Network performance tools

The bcctcpconnlat.yaml module traces the kernel function that performs active TCP connections, such as through a connect() syscall, for the specified period. The results show the latency for each connection. Latency is measured locally as the time from when the SYN packet is sent to when the response packet is received. TCP connection latency indicates the time that it takes to establish a connection. For more information, see aws-ec2rescue-linux/mod.d/bcctcpconnlat.yaml on the GitHub website.

The bcctcptop.yaml module shows TCP connection throughput per host and port for the specified period and number of times. The module doesn't clear the screen between outputs. For more information, see aws-ec2rescue-linux/mod.d/bcctcptop.yaml on the GitHub website.

The bcctcplife.yaml module summarizes TCP sessions that open and close while the tool is tracing. For more information, see aws-ec2rescue-linux/mod.d/bcctcplife.yaml on the GitHub website.

Example of command and output

After you run a module on your instance, you can find the output of these modules under the /var/tmp/ec2rl directory. The following example includes the command and output from the bcctcptop module with the period parameter set to 5 and the times parameter set to 2:

# ./ec2rl run --only-modules=bcctcptop --period=5 --times=2
        
# cat /var/tmp/ec2rl/2020-04-20T21_50_01.177374/mod_out/run/bcctcptop.log 
I will collect tcptop output from this alami box 2 times.
Tracing... Output every 5 secs. Hit Ctrl-C to end
21:50:17 loadavg: 0.74 0.33 0.17 5/244 4285
PID    COMM         LADDR                 RADDR                  RX_KB  TX_KB
3989   sshd         172.31.22.238:22      72.21.196.67:26601         0      9
21:50:22 loadavg: 0.84 0.36 0.18 4/244 4285
PID    COMM         LADDR                 RADDR                  RX_KB  TX_KB
3989   sshd         172.31.22.238:22      72.21.196.67:26601         0     11
2731   amazon-ssm-a 172.31.22.238:54348   52.94.225.236:443          5      4
2938   amazon-ssm-a 172.31.22.238:58878   52.119.197.249:443         0      0
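
When a log contains many intervals, a per-process total can be easier to read than the raw samples. The following sketch sums the RX_KB and TX_KB columns for each PID and command name in a log that has the layout shown in the preceding output. The summarize_tcptop helper is a hypothetical name, and the sketch assumes that command names contain no spaces, so that every data row has exactly six fields.

```shell
#!/bin/sh
# Hypothetical helper: total RX_KB and TX_KB per process in a bcctcptop log.
# Data rows start with a numeric PID and have six whitespace-separated fields;
# banner, loadavg, and header lines do not match and are skipped.
summarize_tcptop() {
    awk '
        $1 ~ /^[0-9]+$/ && NF == 6 {
            key = $1 " " $2
            rx[key] += $5
            tx[key] += $6
        }
        END {
            for (key in rx)
                printf "%s RX_KB=%d TX_KB=%d\n", key, rx[key], tx[key]
        }
    ' "$1" | sort
}

# Usage (path taken from the example above):
# summarize_tcptop /var/tmp/ec2rl/2020-04-20T21_50_01.177374/mod_out/run/bcctcptop.log
```

For the preceding output, the sshd process (PID 3989) would show a TX_KB total of 20, which is the sum of the two intervals.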

To upload the output to AWS Support, run the following command:

# ./ec2rl upload --upload-directory=/var/tmp/ec2rl/2020-04-20T21_50_01.177374 --support-url="URLProvidedByAWSSupport"

Note: The quotation marks in the preceding command are required. If you use sudo to run the tools, then use sudo to upload the output. To use an Amazon Simple Storage Service (Amazon S3) presigned URL to upload the output, run the help upload command and follow the instructions.
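
The upload command needs the directory of a specific run. The following sketch locates the most recent run directory so that you can pass it to --upload-directory. The latest_run_dir helper is a hypothetical name, and the sketch assumes that /var/tmp/ec2rl contains only run directories that are named by timestamp, as in the preceding example, so that the last directory in sorted order is the newest. The sketch prints the upload command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: print the newest run directory under a base directory.
# Shell globs expand in sorted order, and timestamp names sort chronologically,
# so the last directory visited is the most recent run.
latest_run_dir() {
    base=${1:-/var/tmp/ec2rl}
    latest=""
    for dir in "$base"/*/; do
        [ -d "$dir" ] || continue
        latest=${dir%/}
    done
    [ -n "$latest" ] && printf '%s\n' "$latest"
}

if run_dir=$(latest_run_dir); then
    # Print the command for review; replace the placeholder URL before use.
    echo "./ec2rl upload --upload-directory=$run_dir --support-url=\"URLProvidedByAWSSupport\""
else
    echo "No EC2Rescue for Linux output found under /var/tmp/ec2rl"
fi
```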

Related information

How do I diagnose high CPU utilization on an EC2 Windows instance?

AWS OFFICIAL
Updated 3 months ago