Why do EC2 instances of the same instance type show different CPU MHz values?

I have two Amazon Elastic Compute Cloud (Amazon EC2) instances of the same instance type, such as m5a.xlarge. Even though the instances are the same instance type, they show different CPU MHz values in /proc/cpuinfo.

Resolution

Don't rely on the "cpu MHz" field in the /proc/cpuinfo file to check CPU clock speed, because its behavior is inconsistent. Depending on the cpufreq configuration, "cpu MHz" is either a constant value or the last requested frequency.
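To see the values in question on each instance, a minimal sketch such as the following prints the "cpu MHz" entry for every logical CPU. The function name cpu_mhz_values is hypothetical, and the optional file argument is an assumption added so that the helper can be tried against a saved copy of the file.

```shell
# Hypothetical helper: print the "cpu MHz" value of every logical CPU
# listed in a cpuinfo-style file (defaults to /proc/cpuinfo).
cpu_mhz_values() {
    # Fields are separated by a colon followed by optional spaces;
    # the second field of each "cpu MHz" line is the value.
    awk -F': *' '/^cpu MHz/ { print $2 }' "${1:-/proc/cpuinfo}"
}
```

Running cpu_mhz_values on two instances of the same type shows the differing values. They are a snapshot of what the kernel reports, not a measurement of the attainable clock speed.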

To inspect CPU clock rate or clock speed information, use turbostat to read the frequency reports. For more information, see turbostat on the Red Hat Enterprise Customer Portal.

To invoke turbostat, use one of the following methods:

  • Supply a command: The command is forked, and statistics are printed when the command action completes.
  • Omit the command: turbostat displays statistics every 5 seconds. Use the --interval option to change the interval.
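The two invocation methods can be sketched as small wrappers. The function names and the TURBOSTAT variable are hypothetical. The variable exists only so that the arguments can be previewed (for example, with TURBOSTAT=echo) on a machine where turbostat isn't installed. turbostat itself typically requires root privileges.

```shell
# Assumption: turbostat is on the PATH. Override TURBOSTAT to preview arguments.
TURBOSTAT=${TURBOSTAT:-turbostat}

# Method 1: supply a command. turbostat forks it and prints
# statistics when the command completes.
turbostat_for_command() {
    "$TURBOSTAT" "$@"
}

# Method 2: omit the command. turbostat prints statistics periodically;
# the first argument sets the interval in seconds (5 if omitted).
turbostat_every() {
    "$TURBOSTAT" --interval "${1:-5}"
}
```

For example, turbostat_for_command sleep 10 reports on a 10-second window, and turbostat_every 10 prints a report every 10 seconds until you stop it.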

To verify that your instance reaches the maximum advertised CPU MHz, use stress-ng to increase the CPU load when you run the turbostat report. For more information about stress-ng, see Chapter 29. Stress testing real-time systems with stress-ng on the Red Hat Customer Portal.

Example

Note: You can use other purpose-built analysis tools to complete these steps.

The following example uses turbostat to report the CPU frequency while the stress-ng tool increases the load.

The m5a.xlarge instance type uses the AMD EPYC 7571 processor. The EPYC 7571 processor can operate at an all-core turbo clock speed of 2.5 GHz.

1.    Create two SSH sessions to the same instance.

2.    In both SSH shells, run the sudo su command to become the root user.

3.    In the first shell, run the turbostat command.

4.    In the second shell, run the following command:

stress-ng --cpu 4 --cpu-method matrixprod --metrics-brief --perf -t 1

5.    Stop the turbostat report in the first shell, and then read the report.
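If a second SSH session isn't convenient, the steps above can be approximated from a single root shell. The following is a sketch under stated assumptions, not part of the original procedure. The function name measure_under_load, the TURBOSTAT and STRESS_NG variables, and the 30-second default duration are all hypothetical choices. The default duration was picked only so that the load spans several 5-second turbostat intervals.

```shell
# Assumptions: turbostat and stress-ng are installed and the shell runs as root.
# Both variables can be overridden to try the flow without the tools.
TURBOSTAT=${TURBOSTAT:-turbostat}
STRESS_NG=${STRESS_NG:-stress-ng}

measure_under_load() {
    report=$1          # file that receives the turbostat report
    seconds=${2:-30}   # assumed load duration, in seconds

    # Start the periodic report in the background (steps 1 and 3).
    "$TURBOSTAT" --interval 5 > "$report" 2>&1 &
    turbostat_pid=$!

    # Load four CPUs with the stress-ng options from step 4.
    "$STRESS_NG" --cpu 4 --cpu-method matrixprod --metrics-brief --perf -t "$seconds"

    # Stop the report (step 5); it can then be read from "$report".
    kill "$turbostat_pid" 2>/dev/null || true
    wait "$turbostat_pid" 2>/dev/null || true
    return 0
}
```

A call such as measure_under_load /tmp/turbostat.txt leaves the report in /tmp/turbostat.txt. Because turbostat starts just before the load, the idle intervals shown at the start and end of the example report might be shorter in this variant.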

The following example turbostat report is taken when stress-ng runs:

[root@test-server]# turbostat
Core  CPU  Avg_MHz  Busy%  Bzy_MHz  TSC_MHz  IRQ
-     -    628      24.64  2550     2200     5739
0     0    631      24.77  2549     2200     1494
0     2    632      24.78  2549     2200     1329
1     1    621      24.37  2550     2200     1438
1     3    628      24.63  2550     2200     1478
Core  CPU  Avg_MHz  Busy%  Bzy_MHz  TSC_MHz  IRQ
-     -    2550     100    2550     2200     20518
0     0    2550     100    2550     2200     5221
0     2    2550     100    2550     2200     5190
1     1    2550     100    2550     2200     5051
1     3    2550     100    2550     2200     5056
Core  CPU  Avg_MHz  Busy%  Bzy_MHz  TSC_MHz  IRQ
-     -    2550     100    2550     2200     20621
0     0    2550     100    2550     2200     5162
0     2    2550     100    2550     2200     5261
1     1    2550     100    2550     2200     5068
1     3    2550     100    2550     2200     5130
Core  CPU  Avg_MHz  Busy%  Bzy_MHz  TSC_MHz  IRQ
-     -    2015     79.03  2550     2200     17053
0     0    2012     78.93  2550     2200     4262
0     2    2021     79.28  2550     2200     4301
1     1    2014     78.99  2550     2200     4296
1     3    2012     78.91  2550     2200     4194
Core  CPU  Avg_MHz  Busy%  Bzy_MHz  TSC_MHz  IRQ
-     -    97       3.8    2548     2200     973
0     0    94       3.7    2547     2200     249
0     2    102      3.99   2548     2200     119
1     1    92       3.61   2549     2200     432
1     3    100      3.91   2549     2200     173
Core  CPU  Avg_MHz  Busy%  Bzy_MHz  TSC_MHz  IRQ
-     -    95       3.71   2548     2200     939
0     0    94       3.68   2547     2200     246
0     2    103      4.05   2548     2200     305
1     1    86       3.36   2549     2200     256
1     3    96       3.76   2549     2200     132

The preceding example report shows that the Busy% value increases from 24.64% to 100% while stress-ng runs and then decreases. As the CPU gets busier, the Avg_MHz value increases from 628 MHz to 2550 MHz.

The report explains why you see different values. While the CPU was mostly idle, it averaged 628 MHz over the interval. When you increased the load on the CPU, the CPU ran at its full capacity of 2550 MHz. This confirms that you received the advertised instance performance.
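To follow this progression in a longer report, it can help to print only the summary rows, where the CPU column is "-". The following is a minimal sketch that assumes the report was saved to a file and that its header contains the CPU, Avg_MHz, Busy%, and Bzy_MHz columns, as in the example. The function name turbostat_summary is hypothetical.

```shell
# Hypothetical helper: print Avg_MHz, Busy% and Bzy_MHz from each summary row
# of a saved turbostat report.
turbostat_summary() {
    awk '
        # Header row: record the position of every column by name.
        /Avg_MHz/ { for (i = 1; i <= NF; i++) col[$i] = i; next }
        # Summary row: the CPU column holds "-" instead of a CPU number.
        ("CPU" in col) && $(col["CPU"]) == "-" {
            print $(col["Avg_MHz"]), $(col["Busy%"]), $(col["Bzy_MHz"])
        }
    ' "$1"
}
```

Looking the columns up by name, and not by fixed position, is a deliberate choice, because the set of columns that turbostat prints varies between systems.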

The Avg_MHz column reflects the total number of cycles that were run divided by the measurement interval. If the Busy% column is 100%, then the processor ran at that speed for the entire interval.
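Put another way, Avg_MHz is approximately Bzy_MHz multiplied by the Busy% fraction, where Bzy_MHz is the average clock rate while the CPU wasn't idle. The following sketch checks that relationship against the example report. The function name expected_avg_mhz is hypothetical.

```shell
# Hypothetical helper: Avg_MHz expected from Bzy_MHz ($1) and Busy% ($2),
# rounded to the nearest MHz.
expected_avg_mhz() {
    awk -v bzy="$1" -v busy="$2" 'BEGIN { printf "%.0f\n", bzy * busy / 100 }'
}
```

For the first summary row, expected_avg_mhz 2550 24.64 prints 628, which matches the reported Avg_MHz. The CPU wasn't clocked at 628 MHz. It ran at about 2550 MHz for roughly a quarter of the interval.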

The tests confirm that you can't rely on the proc or sysfs kernel interfaces to inspect the CPU MHz. Instead, use purpose-built analysis tools, such as turbostat and stress-ng.

Related information

Bug 197153 - Constant "cpu MHz" in /proc/cpuinfo on Kernel.org Bugzilla

AWS OFFICIAL
Updated 6 months ago