Message-ID: <20240229162520.970986-1-vanshikonda@os.amperecomputing.com>
Date: Thu, 29 Feb 2024 08:25:12 -0800
From: Vanshidhar Konda <vanshikonda@...amperecomputing.com>
To: Huisong Li <lihuisong@...wei.com>,
Beata Michalska <beata.michalska@....com>
Cc: Vanshidhar Konda <vanshikonda@...amperecomputing.com>,
Ionela Voinescu <ionela.voinescu@....com>,
linux-kernel@...r.kernel.org,
linux-pm@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org,
rafael@...nel.org,
sumitg@...dia.com,
zengheng4@...wei.com,
yang@...amperecomputing.com,
will@...nel.org,
sudeep.holla@....com,
liuyonglong@...wei.com,
zhanjie9@...ilicon.com,
linux-acpi@...r.kernel.org
Subject: [PATCH v1 0/3] arm64: Use AMU counters for measuring CPU frequency
The AMU extension was added in Armv8.4 as an optional extension. It
provides architectural counters that can be used to measure CPU
frequency: CPU_CYCLES and CNT_CYCLES.
In the kernel, FIE uses these counters to compute the frequency scale on
every tick. The counters are also used in the CPPC driver if the
firmware publishes support for the reference & delivered registers using
ACPI FFH.
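
As a rough illustration of how two such counters yield a frequency, the sketch below takes two snapshots and scales the ratio of the deltas by the reference rate. This is not the code from this series: the struct, the function name and the ref_khz parameter are hypothetical, and it assumes that CNT_CYCLES advances at a fixed, known rate while CPU_CYCLES advances at the current core clock.

```c
#include <stdint.h>

/* Hypothetical snapshot of both AMU counters taken at one instant. */
struct amu_sample {
	uint64_t core_cycles;   /* CPU_CYCLES: assumed to tick at the core clock */
	uint64_t const_cycles;  /* CNT_CYCLES: assumed to tick at a fixed rate */
};

/*
 * Average core frequency in kHz between two snapshots.
 * ref_khz is the (assumed known) rate of the constant counter.
 * Returns 0 if the constant counter did not advance.
 * Note: the multiplication can overflow for very large deltas; a real
 * implementation would need to guard against that.
 */
static uint64_t amu_avg_freq_khz(struct amu_sample prev,
				 struct amu_sample cur,
				 uint64_t ref_khz)
{
	uint64_t d_core = cur.core_cycles - prev.core_cycles;
	uint64_t d_const = cur.const_cycles - prev.const_cycles;

	if (d_const == 0)
		return 0;

	return d_core * ref_khz / d_const;
}
```

The result is only meaningful if both counters of a snapshot are captured close together in time.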
In the current implementation, using these counters in the CPPC driver
results in inaccurate measurements in some cases. This has been discussed
in [1] and [2].
The CPPC delivered register and reference register are currently read in
two different cpc_read() calls. There can be significant latency between
the CPU reading these two registers if the core is interrupted, leading
to an inaccurate result. Also, when these registers are in the FFH
region, reading each register using cpc_read() will result in 2 IPI
interrupts to the core whose registers are being read. It will also wake
up any idle core to read the AMU counters.
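
To make the effect of such latency concrete, here is a toy model, not taken from the driver: the delivered delta covers the intended sampling window, while the reference delta covers the window plus some extra delay. The function name and all parameters are hypothetical, and which register is read late depends on the actual read order.

```c
#include <stdint.h>

/*
 * Toy model of read skew between the two counters.
 * core_khz:  true core frequency
 * ref_khz:   rate of the reference counter
 * window_us: intended sampling window
 * skew_us:   extra time that only the reference delta accumulates
 * Returns the frequency (kHz) that would be computed from the deltas,
 * or 0 if the reference delta is zero.
 */
static uint64_t skewed_freq_khz(uint64_t core_khz, uint64_t ref_khz,
				uint64_t window_us, uint64_t skew_us)
{
	/* cycles = kHz * us / 1000 */
	uint64_t d_delivered = core_khz * window_us / 1000;
	uint64_t d_reference = ref_khz * (window_us + skew_us) / 1000;

	if (d_reference == 0)
		return 0;

	return d_delivered * ref_khz / d_reference;
}
```

In this model, a core running at 3000000 kHz sampled over a 2000 us window is reported as 2400000 kHz when the reference read is delayed by 500 us, and as 3000000 kHz when there is no delay.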
This patch series makes two changes:
- Implement arch_freq_get_on_cpu() for arm64 to record AMU counters on
every clock tick
- Make the CPPC driver read the delivered and reference registers in a
single IPI while avoiding waking up an idle core to read AMU counters;
this also allows measuring the CPU frequency of isolated CPUs
Results on an AmpereOne system with 128 cores after the patch:
When the system is idle:
core     scaling_cur_freq   cpuinfo_cur_freq
[0]:     3068518            3000000
[1]:     1030869            1000000
[2]:     1031296            1000000
[3]:     1032224            1000000
[4]:     1032469            1000000
[5]:     1030987            1000000
..
..
isolcpus = 122-127
[122]:   1030724            1000000
[123]:   1030667            1000000
[124]:   1031888            1000000
[125]:   1031047            1000000
[126]:   1031683            1000000
[127]:   1030794            1000000
With stress applied to cores 122-126:
core     scaling_cur_freq   cpuinfo_cur_freq
[0]:     3050000            3000000
[1]:     1031068            1000000
[2]:     1030699            1000000
[3]:     1031818            1000000
[4]:     1032251            1000000
[5]:     1031282            1000000
..
..
isolcpus = 122-127
[122]:   3000061            3012000
[123]:   3000041            3008000
[124]:   3000038            2998000
[125]:   3000062            2995000
[126]:   3000035            3004000
[127]:   1031440            1000000
[1]: https://lore.kernel.org/all/20230328193846.8757-1-yang@os.amperecomputing.com/
[2]: https://lore.kernel.org/linux-arm-kernel/20231212072617.14756-1-lihuisong@huawei.com/
Vanshidhar Konda (3):
arm64: topology: Add arch_freq_get_on_cpu() support
arm64: idle: Cache AMU counters before entering idle
ACPI: CPPC: Read CPC FFH counters in a single IPI
arch/arm64/kernel/idle.c | 10 +++
arch/arm64/kernel/topology.c | 153 ++++++++++++++++++++++++++++++-----
drivers/acpi/cppc_acpi.c | 32 +++++++-
include/acpi/cppc_acpi.h | 13 +++
4 files changed, 186 insertions(+), 22 deletions(-)
--
2.43.1