Message-ID: <aKzGlD7ZDIS4XMsF@google.com>
Date: Mon, 25 Aug 2025 20:24:52 +0000
From: Prashant Malani <pmalani@...gle.com>
To: Beata Michalska <beata.michalska@....com>
Cc: Yang Shi <yang@...amperecomputing.com>,
open list <linux-kernel@...r.kernel.org>,
"open list:CPU FREQUENCY SCALING FRAMEWORK" <linux-pm@...r.kernel.org>,
"Rafael J. Wysocki" <rafael@...nel.org>,
Viresh Kumar <viresh.kumar@...aro.org>,
Catalin Marinas <catalin.marinas@....com>,
Ionela Voinescu <Ionela.Voinescu@....com>
Subject: Re: [PATCH] cpufreq: CPPC: Increase delay between perf counter reads
On Aug 25 16:52, Beata Michalska wrote:
> On Wed, Aug 20, 2025 at 02:25:16PM -0700, Prashant Malani wrote:
> > Hi Beata,
> >
> > On Wed, 20 Aug 2025 at 13:49, Beata Michalska <beata.michalska@....com> wrote:
> > >
> > > Kinda working on that one.
> >
> > OK. I'm eager to see what the solution is!
> >
> > > >
> > > > Outside of that, I can't think of another mitigation beyond adding delay to make
> > > > the time deltas not matter so much.
> > > I'm not entirely sure what 'so much' means in this context.
> > > How would one quantify whether the added delay is actually mitigating the issue?
> > >
> >
> > I alluded to it in the commit description, but here is my basic
> > numerical analysis:
> > The effective timestamps for the 4 readings right now are:
> > Timestamp t0: del0
> > Timestamp t0 + m: ref0
> > (Time delay X us)
> > Timestamp t1: del1
> > Timestamp t1 + n: ref1
> >
> > Timestamp t1 = t0 + m + X
> >
> > The perf calculation is:
> > Perf = (del1 - del0) / (ref1 - ref0)
> >      = Del_counter_diff_over_time(t1 - t0) /
> >        ref_counter_diff_over_time((t1 + n) - (t0 + m))
> >      = Del_counter_diff_over_time((t0 + m + X) - t0) /
> >        ref_counter_diff_over_time((t0 + m + X + n) - (t0 + m))
> >      = Del_counter_diff_over_time(m + X) / ref_counter_diff_over_time(n + X)
> >
> > If X >> (m,n) this becomes:
> > = Del_counter_diff_over_time(X) / ref_counter_diff_over_time(X)
> > which is what the actual calculation is supposed to be.
> >
> > If X ~ (m, n) (which is the case right now), the calculation
> > becomes erratic.
> This is still bound by 'm' and 'n' values, as the difference between those will
> determine the error factor (with given, fixed X). If m != n, one counter delta
> is stretched more than the other, so the perf ratio no longer represents the
> same time interval. And that will vary between platforms/workloads leading to
> over/under-reporting.
What you are saying holds when m,n ~ X. But if X >> m,n, the X component
dominates. On most platforms, m and n are typically 1-2 us.
If X is anything >= 100us, it dominates the m,n component, making both
time intervals practically the same, i.e.
(100 + 1) / (100 + 2) = 101 / 102 = 0.9902 ~ 1.00
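
To make that concrete, here is a small user-space sketch (plain C, not
kernel code; the counter rates and the m/n skews below are made-up
numbers purely for illustration) that evaluates the (m + X) / (n + X)
scaling from the analysis above for a few values of X:

#include <stdio.h>

/*
 * Illustration only, not kernel code.  Model from the analysis above:
 * del0 is read at t0, ref0 at t0 + m, del1 at t1 = t0 + m + X, and
 * ref1 at t1 + n.  So the delivered delta spans (m + X) while the
 * reference delta spans (n + X).  Counter rates and skews are assumed.
 */
int main(void)
{
	const double del_hz = 2.0e9;   /* delivered counter rate (assumed) */
	const double ref_hz = 1.0e9;   /* reference counter rate (assumed) */
	const double m = 1e-6;         /* del0 -> ref0 read skew (assumed) */
	const double n = 2e-6;         /* del1 -> ref1 read skew (assumed) */
	const double delays_us[] = { 2, 10, 100, 1000 };
	const double true_ratio = del_hz / ref_hz;
	unsigned int i;

	for (i = 0; i < sizeof(delays_us) / sizeof(delays_us[0]); i++) {
		double X = delays_us[i] * 1e-6;
		double del_delta = del_hz * (m + X);
		double ref_delta = ref_hz * (n + X);
		double ratio = del_delta / ref_delta;

		printf("X = %5.0f us: ratio = %.4f (true %.4f, error %+.2f%%)\n",
		       delays_us[i], ratio, true_ratio,
		       100.0 * (ratio - true_ratio) / true_ratio);
	}
	return 0;
}

With those made-up numbers, X = 2us under-reports the ratio by roughly
25%, whereas X = 100us is within about 1% of the true value, which is
the point I'm trying to make.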
> >
> > There have been other observations on this topic [1] that suggest
> > that even 100us improves the error rate significantly from what it
> > is with 2us.
> >
> > BR,
> Which is exactly why I've mentioned this approach is not really recommended,
> being bound to a rather specific setup. There have been similar proposals in
> the past, all with different values of the delay, which should illustrate how
> fragile a solution (if any) this is.
The reports/occurrences point to the fact that the current value doesn't work.
Another way of putting it is: why is 2us considered the "right" value?
This patch was never meant to be an ideal solution, but it's better than what
is there at present. Currently, `policy->cur` is completely unusable on CPPC;
the bogus value crops up in other locations in the cpufreq driver core [1]
and also breaks a user-facing ABI, i.e. scaling_setspeed.
I realize you're working on a solution, so if that is O(weeks) away, it
makes sense to wait; otherwise it would seem logical to mitigate the
error (it can always be reverted once the "better" solution is in
place).
Ultimately it's your call, but I'm not convinced by the rationale provided
thus far.
Best regards,
-Prashant
[1] https://lore.kernel.org/linux-pm/20250823001937.2765316-1-pmalani@google.com/T/#t