How can I use the atop tool to get historical utilization stats for processes on my EC2 Linux instance?

I want to monitor historical resource usage on my Amazon Elastic Compute Cloud (Amazon EC2) instance. How can I use the atop tool to do this?

Short description

The atop tool is a performance monitor that records resource usage for later analysis and can also report in real time. It captures CPU utilization, memory consumption, and disk I/O for each process and thread. Because atop runs as a background service while it records the stats, it supports long-term server analysis. By default, the stats are stored for 28 days.

Note: The atop tool starts logging data only after it's installed. Performance data for processes from before the installation date can't be retrieved.


Install atop

For installation instructions, see How do I configure the ATOP and SAR monitoring tools for my EC2 instance running Amazon Linux, RHEL, CentOS, or Ubuntu?

Read atop report logs for historical review and analysis

The atop tool creates log files in /var/log/atop. These files are named in the following format: atop_ccyymmdd. For example, atop_20210902 is the recording for September 2, 2021.
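As a minimal sketch of this naming scheme, the following shell function builds the log file path for a given day. The function name atop_log_path is hypothetical, and the sketch assumes GNU date (for the -d option) and the default /var/log/atop directory.

```shell
# Hypothetical helper: print the atop log file path for a given day.
# Assumes GNU date (-d) and the default /var/log/atop log directory.
# $1: any date string that GNU date accepts; defaults to today.
atop_log_path() {
    printf '/var/log/atop/atop_%s\n' "$(date -d "${1:-today}" +%Y%m%d)"
}

# Example:
#   atop_log_path 2021-09-02
# prints /var/log/atop/atop_20210902
```

The printed path can then be passed to atop with the -r option.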

To access the log file, run the command atop -r atoplogfilepath. Replace atoplogfilepath with the full path to the atop log file. The following example shows the command and the first lines of its output:

atop -r /var/log/atop/atop_20210902 

ATOP - ip-172-20-139-91                2021/09/02  17:03:44                ----------------                 3h33m7s elapsed
PRC |  sys    6.51s  |  user   7.85s  |  #proc    103  |  #tslpi    81 |  #tslpu     0  |  #zombie    0  |  #exit      0  |
CPU |  sys     0%  |  user      3%  |  irq       0%  |  idle    197% |  wait      0%  |  ipc notavail  |  curscal   ?%  |
cpu |  sys     0%  |  user      1%  |  irq       0%  |  idle     98% |  cpu000 w  0%  |  ipc notavail  |  curscal   ?%  |
cpu |  sys     0%  |  user      1%  |  irq       0%  |  idle     98% |  cpu001 w  0%  |  ipc notavail  |  curscal   ?%  |

In the preceding output example, the first recorded snapshot was at 2021/09/02 17:03:44. To move forward to the next snapshot, press the t key (lowercase) on the keyboard. To move back to the previous snapshot, press the T key (uppercase).

To analyze a specific time slot, press the b key and then enter the date and time. The atop tool skips to the time that you enter at the Enter new time prompt:

NET |  lo      ----  |  pcki       2  |  pcko       2  |  sp    0 Mbps |  si    0 Kbps  |  so    0 Kbps  |  erro       0  |
Enter new time (format [YYYYMMDD]hhmm):
  PID              TID              RDDSK              WRDSK             WCANCL              DSK             CMD        1/4
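The same jump can also be requested when atop starts, by using its -b flag. The following is a hedged sketch: the function name atop_replay_from and the ATOP_BIN override are hypothetical, and the accepted time format depends on the atop version (the prompt above shows [YYYYMMDD]hhmm, while older releases document hh:mm), so check man atop on your instance.

```shell
# Hypothetical wrapper: open an atop log file at a given start time.
# $1: full path to the log file
# $2: start time in the format that your atop version accepts
# ATOP_BIN is an illustrative override for previewing the command
# (for example ATOP_BIN=echo); it is not an atop setting.
atop_replay_from() {
    "${ATOP_BIN:-atop}" -r "$1" -b "$2"
}

# Example:
#   atop_replay_from /var/log/atop/atop_20210902 1730
```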

Shortcut keys

You can press shortcut keys to view different statistics. The following are example shortcut keys:

Shortcut key    Description
g               Generic info (default).
m               Memory details.
d               Disk details.
n               Network details. This key works only when the netatop kernel module is installed.
c               Full command line per process.

You can use the following shortcut keys to sort the list of processes:

Shortcut key    Sort by
C               CPU activity.
M               Memory consumption.
D               Disk activity.
N               Network activity. This key works only when the netatop kernel module is installed.
A               The most active system resource (auto mode).

Press the h key to view the help documentation.
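The atop man page also lists these letters as command-line flags, so a log can be opened directly in a chosen view and sort order. The following sketch assumes that this behavior holds for your atop version. The function name atop_view and the ATOP_BIN override are hypothetical.

```shell
# Hypothetical wrapper: open a log file in a chosen view and sort order.
# $1: full path to the log file
# $2: view key (g, m, d, n, or c)
# $3: sort key (C, M, D, N, or A)
# ATOP_BIN is an illustrative override (for example ATOP_BIN=echo).
atop_view() {
    case "$2" in
        g|m|d|n|c) ;;
        *) echo "unknown view key: $2" >&2; return 1 ;;
    esac
    case "$3" in
        C|M|D|N|A) ;;
        *) echo "unknown sort key: $3" >&2; return 1 ;;
    esac
    "${ATOP_BIN:-atop}" -r "$1" "-$2" "-$3"
}

# Example: memory details, sorted by memory consumption
#   atop_view /var/log/atop/atop_20210902 m M
```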

The atopsar command

The atopsar command works like the traditional UNIX sar command. You can use it to generate various system activity reports.

The atopsar command uses color coding and (on request) markers to highlight the utilization of a resource. Critical utilization is marked in red and almost critical is marked in cyan.

In the following example, the -c flag generates a report about the current CPU utilization of the system. The example takes two samples, one second apart.

$ atopsar -c 1 2

ip-172-20-139-91  4.14.238-182.422.amzn2.x86_64  #1 SMP Tue Jul 20 20:35:54 UTC 2021  x86_64  2021/09/02

-------------------------- analysis date: 2021/09/02 --------------------------

18:50:16  cpu  %usr %nice %sys %irq %softirq  %steal %guest  %wait %idle  _cpu_
18:50:17  all     0     0    0    0        0       0      0      0   200
            0     0     0    0    0        0       0      0      0   100
            1     0     0    0    0        0       0      0      0   100
18:50:18  all     0     0    0    0        0       0      0      0   200
            0     0     0    0    0        0       0      0      0   100
            1     0     0    0    0        0       0      0      0   100

The atopsar command can also analyze historical data. For example, run the following command to generate all reports (-A) for the current day, starting at 13:00 (-b) and ending at 13:35 (-e).

atopsar -A -b 13:00 -e 13:35

To read a previous day's file, use the -r option and specify the log file name.
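For instance, a CPU report for a time window on an earlier day might be requested as in the following sketch. It uses only the flags described in this section (-r, -c, -b, and -e). The function name atopsar_cpu_window and the ATOPSAR_BIN override are hypothetical, and the sketch assumes GNU date and the default /var/log/atop directory.

```shell
# Hypothetical wrapper: CPU report from a given day's log for a time window.
# $1: date string that GNU date accepts (for example 2021-09-01 or yesterday)
# $2: begin time (hh:mm)
# $3: end time (hh:mm)
# ATOPSAR_BIN is an illustrative override (for example ATOPSAR_BIN=echo).
atopsar_cpu_window() {
    "${ATOPSAR_BIN:-atopsar}" \
        -r "/var/log/atop/atop_$(date -d "$1" +%Y%m%d)" \
        -c -b "$2" -e "$3"
}

# Example:
#   atopsar_cpu_window 2021-09-01 13:00 13:35
```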

Related information

Why is my EC2 Linux instance becoming unresponsive due to over-utilization of resources?

AWS OFFICIAL | Updated a year ago