Message-ID: <aJ7B2labaxza9duY@google.com>
Date: Fri, 15 Aug 2025 05:12:58 +0000
From: Prashant Malani <pmalani@...gle.com>
To: "Rafael J. Wysocki" <rafael@...nel.org>
Cc: Viresh Kumar <viresh.kumar@...aro.org>,
Beata Michalska <beata.michalska@....com>,
Jie Zhan <zhanjie9@...ilicon.com>,
Ionela Voinescu <ionela.voinescu@....com>,
Ben Segall <bsegall@...gle.com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Ingo Molnar <mingo@...hat.com>, Juri Lelli <juri.lelli@...hat.com>,
open list <linux-kernel@...r.kernel.org>,
"open list:CPU FREQUENCY SCALING FRAMEWORK" <linux-pm@...r.kernel.org>,
Mel Gorman <mgorman@...e.de>, Peter Zijlstra <peterz@...radead.org>,
Steven Rostedt <rostedt@...dmis.org>,
Valentin Schneider <vschneid@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
z00813676 <zhenglifeng1@...wei.com>, sudeep.holla@....com
Subject: Re: [PATCH v2 2/2] cpufreq: CPPC: Dont read counters for idle CPUs
Thanks a lot for taking a look at this, Rafael.
On Aug 14 13:48, Rafael J. Wysocki wrote:
>
> First off, AFAICS, using idle_cpu() for reliable detection of CPU
> idleness in a sysfs attribute code path would be at least
> questionable, if not outright invalid. By the time you have got a
> result from it, there's nothing to prevent the CPU in question from
> going idle or waking up from idle.
This is a heuristic-based optimization. The observation is that when
the CPU is idle (or near-idle/lightly loaded, since FFH actually wakes
up an idle CPU), the AMU counters as read from the kernel are unreliable.
It is fine if the CPU wakes up from idle immediately after the check.
In that case, we'd return the desired frequency (via a PCC register read),
which is what the frequency would be anyway had the AMU measurement
actually been taken.
In a sense, the assumption here is no worse than what is there at
present; currently the samples are taken across 2us, and (theoretically)
if the difference between them is 0, we take the fallback path. There is
nothing to prevent the CPU from waking up immediately after that 2us
sample period.
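To make the comparison concrete, here is a rough userspace model of the two
decision points described above. It is only a sketch, not the actual
cppc_cpufreq code: every name in it (fb_sample, model_get_rate_khz,
looks_idle, ref_khz, desired_khz) is hypothetical, and the counter snapshots
are passed in rather than read from hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the two feedback counters (not a kernel struct). */
struct fb_sample {
	uint64_t delivered;
	uint64_t reference;
};

/*
 * Illustrative model only.
 *   looks_idle  - result of the (inherently racy) idle heuristic
 *   t0, t1      - two counter snapshots taken a short interval apart
 *   ref_khz     - frequency that the reference counter ticks at
 *   desired_khz - frequency derived from the desired-performance register
 */
static uint64_t model_get_rate_khz(bool looks_idle,
				   struct fb_sample t0, struct fb_sample t1,
				   uint64_t ref_khz, uint64_t desired_khz)
{
	uint64_t d_delivered, d_reference;

	/* Proposed heuristic: skip the counter read when the CPU looks idle. */
	if (looks_idle)
		return desired_khz;

	d_delivered = t1.delivered - t0.delivered;
	d_reference = t1.reference - t0.reference;

	/* Existing fallback: no counter progress across the sample window. */
	if (d_delivered == 0 || d_reference == 0)
		return desired_khz;

	return ref_khz * d_delivered / d_reference;
}
```

In both fallback branches the result is the same desired frequency, and in
both cases the CPU's state can change right after the decision is made.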
> Moreover, the fact that the given
> CPU is idle from the scheduler perspective doesn't actually mean that
> it is in an idle state and so it has no bearing on whether or not its
> performance counters can be accessed etc.
The idle check isn't meant to guard against accessing the counters.
AFAICT it is perfectly valid to access the counters even when the CPU is
actually idle.
>
> The way x86 deals with this problem is to snapshot the counters in
> question periodically (actually, in scheduler ticks) and fall back to
> cpu_khz if the interval between the two consecutive updates is too
> large (see https://elixir.bootlin.com/linux/v6.16/source/arch/x86/kernel/cpu/aperfmperf.c#L502).
> I think that this is the only reliable way to handle it, but I may be
> mistaken.
This is interesting. I think it may not work for the CPPC case, since
the registers in question are in some cases accessed through PCC reads
which require semaphores. I think it would be untenable to do that in
the tick handler (but I may be mistaken here). It's easier on x86
since those are always just MSRs.
We could probably do it for the FFH case, but then we're bifurcating
the computation method and IMO that's not worth the hassle.
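For reference, the tick-based scheme discussed above has roughly the
following shape. This is a simplified userspace model and not the
aperfmperf.c code: the names (tick_sample, model_tick_update,
model_read_khz) and the MODEL_MAX_SAMPLE_AGE threshold are assumptions made
up for illustration.

```c
#include <stdint.h>

/* Hypothetical per-CPU record refreshed from the tick (illustrative only). */
struct tick_sample {
	uint64_t delivered_delta;	/* delivered-counter delta between two ticks */
	uint64_t reference_delta;	/* reference-counter delta between two ticks */
	uint64_t stamp;			/* tick count when the sample was stored */
};

/* Assumed staleness threshold in ticks; the value is arbitrary. */
#define MODEL_MAX_SAMPLE_AGE 5u

/* Writer side: called from the (modelled) tick with deltas since the last tick. */
static void model_tick_update(struct tick_sample *s, uint64_t d_delivered,
			      uint64_t d_reference, uint64_t now)
{
	s->delivered_delta = d_delivered;
	s->reference_delta = d_reference;
	s->stamp = now;
}

/* Reader side: use the stored sample if it is fresh, else the nominal value. */
static uint64_t model_read_khz(const struct tick_sample *s, uint64_t now,
			       uint64_t nominal_khz)
{
	if (now - s->stamp > MODEL_MAX_SAMPLE_AGE || s->reference_delta == 0)
		return nominal_khz;

	return nominal_khz * s->delivered_delta / s->reference_delta;
}
```

The reader never touches the counters itself, which is why the writer side
has to be cheap enough to run from the tick.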
Perhaps some of the ARM experts here can think of ways to do this that
I haven't considered.
Best regards,
-Prashant