lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ff65b997-eb14-798f-1d2f-c375bf763e71@hisilicon.com>
Date: Fri, 20 Jun 2025 11:53:47 +0800
From: Jie Zhan <zhanjie9@...ilicon.com>
To: Prashant Malani <pmalani@...gle.com>
CC: Ben Segall <bsegall@...gle.com>, Dietmar Eggemann
	<dietmar.eggemann@....com>, Ingo Molnar <mingo@...hat.com>, Juri Lelli
	<juri.lelli@...hat.com>, open list <linux-kernel@...r.kernel.org>, "open
 list:CPU FREQUENCY SCALING FRAMEWORK" <linux-pm@...r.kernel.org>, Mel Gorman
	<mgorman@...e.de>, Peter Zijlstra <peterz@...radead.org>, "Rafael J. Wysocki"
	<rafael@...nel.org>, Steven Rostedt <rostedt@...dmis.org>, Valentin Schneider
	<vschneid@...hat.com>, Vincent Guittot <vincent.guittot@...aro.org>, Viresh
 Kumar <viresh.kumar@...aro.org>, Ionela Voinescu <ionela.voinescu@....com>,
	Beata Michalska <beata.michalska@....com>, z00813676
	<zhenglifeng1@...wei.com>
Subject: Re: [PATCH v2 2/2] cpufreq: CPPC: Dont read counters for idle CPUs

Hi Prashant,

Thanks for spotting this.
Cc'd a few more developers.

Jie

On 19/06/2025 08:09, Prashant Malani wrote:
> AMU performance counters tend to be inaccurate when measured on idle CPUs.
> On an idle CPU which is programmed to 3.4 GHz (verified through firmware),
> here is a measurement and calculation of operating frequency:
> 
> t0: ref=899127636, del=3012458473
> t1: ref=899129626, del=3012466509
> perf=40

In this case, the target cpu is mostly idle but not fully idle during the
sampling window since the counter grows a little bit.
Perhaps some interrupts happen to run on the cpu shortly.

Thus, the actual issue is the accuracy of frequency sampling becomes poor
when the delta of counters are too small to obtain a reliable accuracy.

Would it be more sensible to put a minimum threshold of the delta of
counters when sampling the frequency?

If the threshold is not met, we can go to the existing out_invalid_counters
path.  Then we don't have to export idle_cpu() either, and BTW, that ABI
doesn't seem to be synchronous at all, i.e. the cpu might be busy when we
check and then become idle when sampling.

> 
> For reference, when we measure the same CPU with stress-ng running, we have
> a more accurate result:
> t0: ref=30751756418, del=104490567689
> t1: ref=30751760628, del=104490582296
> perf=34
> 
> (t0 and t1 are 2 microseconds apart)
> 
> In the above, the prescribed method[1] of calculating frequency from CPPC
> counters was used.
> 
> The follow-on effect is that the inaccurate frequency is stashed in the
> cpufreq policy struct when the CPU is brought online. Since CPUs are mostly
> idle when they are brought online, this means cpufreq has an inaccurate
> view of the programmed clock rate.
> 
> Consequently, if userspace tries to actually set the frequency to the
> previously erroneous rate (4 GHz in the above example), cpufreq returns
> early without calling in to the CPPC driver to send the relevant PCC
> command; it thinks the CPU is already at that frequency.
> 
> Update the CPPC get_rate() code to skip sampling counters if we know a CPU
> is idle, and go directly to the fallback response of returning the
> “desired” frequency. The code intends to do that anyway if the counters
> happen to return an “idle” reading.
> 
> [1] https://docs.kernel.org/admin-guide/acpi/cppc_sysfs.html#computing-average-delivered-performance
> 
> Signed-off-by: Prashant Malani <pmalani@...gle.com>
> ---
> 
> Changes in v2:
> - Add sched.h header for usage when compiled as module.
> 
>  drivers/cpufreq/cppc_cpufreq.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
> index b7c688a5659c..5ed04774e569 100644
> --- a/drivers/cpufreq/cppc_cpufreq.c
> +++ b/drivers/cpufreq/cppc_cpufreq.c
> @@ -18,6 +18,7 @@
>  #include <linux/cpufreq.h>
>  #include <linux/irq_work.h>
>  #include <linux/kthread.h>
> +#include <linux/sched.h>
>  #include <linux/time.h>
>  #include <linux/vmalloc.h>
>  #include <uapi/linux/sched/types.h>
> @@ -753,6 +754,10 @@ static unsigned int cppc_cpufreq_get_rate(unsigned int cpu)
>  
>  	cpufreq_cpu_put(policy);
>  
> +	/* Idle CPUs have unreliable counters, so skip to the end. */
> +	if (idle_cpu(cpu))
> +		goto out_invalid_counters;
> +
>  	ret = cppc_get_perf_ctrs_sample(cpu, &fb_ctrs_t0, &fb_ctrs_t1);
>  	if (ret) {
>  		if (ret == -EFAULT)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ