lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAFivqm+Xi9FYtzPmT0QkAUxC2Kx_AkrH2NuQE_sVnJVuo48ypA@mail.gmail.com>
Date: Wed, 20 Aug 2025 14:25:16 -0700
From: Prashant Malani <pmalani@...gle.com>
To: Beata Michalska <beata.michalska@....com>
Cc: Yang Shi <yang@...amperecomputing.com>, open list <linux-kernel@...r.kernel.org>, 
	"open list:CPU FREQUENCY SCALING FRAMEWORK" <linux-pm@...r.kernel.org>, "Rafael J. Wysocki" <rafael@...nel.org>, 
	Viresh Kumar <viresh.kumar@...aro.org>, Catalin Marinas <catalin.marinas@....com>, 
	Ionela Voinescu <Ionela.Voinescu@....com>
Subject: Re: [PATCH] cpufreq: CPPC: Increase delay between perf counter reads

Hi Beata,

On Wed, 20 Aug 2025 at 13:49, Beata Michalska <beata.michalska@....com> wrote:
>
> Kinda working on that one.

OK. I'm eager to see what the solution is!

> >
> > Outside of that, I can't think of another mitigation beyond adding delay to make
> > the time deltas not matter so much.
> I'm not entirely sure what 'so much' means in this context.
> How one would quantify whether the added delay is actually mitigating the issue?
>

I alluded to it in the commit description, but here is the my basic
numerical analysis:
The effective timestamps for the 4 readings right now are:
Timestamp t0: del0
Timestamp t0 + m: ref0
(Time delay X us)
Timestamp t1: del1
Timestamp t1 + n: ref1

Timestamp t1 = t0 + m + X

The perf calculation is:
Per = del1 - del0 / ref1 - ref0
      = Del_counter_diff_over_time(t1 - t0) /
ref_counter_diff_over_time(t1 + n - (t0 + m))
      = Del_counter_diff_over time(t0 + m + X - t0) /
ref_counter_diff_over_time((t0 + m + X + n - t0 - m)
      = Del_counter_diff_over_time(m + X) / ref_counter_diff_over_time(n + X)

If X >> (m,n) this becomes:
      = Del_counter_diff_over_time(X) / ref_counter_diff_over_time(X)
which is what the actual calculation is supposed to be.

if X ~ (m, N) (which is what the case is right now), the calculation
becomes erratic.

> >
> > Perhaps ARM should introduce a "snapshot" feature which takes the snapshot
> > of the AMU counters on a register write; IAC that's outside the scope
> What do you mean by register write ?

I mean that we should have a snapshot register:
SYS_AMEVCNTR0_SNAPSHOT_EL0

writing to this will save the current values of SYS_AMEVCNTR0_CORE_EL0 and
SYS_AMEVCNTR0_CONST_EL0 into two shadow registers (say
SYS_AMEVCNTR0_CORE_SNAPSHOT_EL0
and SYS_AMEVCNTR0_CONST_SNAPSHOT_EL0)

That way the OS specifics in how the registers are read don't matter.

Of course, I'm not a H/W architect so this proposal is very rough, but hopefully
helps illustrate my idea.

> > of this discussion.
> >
> > Could you kindly explain why adding the udelay here is discouraged?
>
> Would you mind clarifying how the specific value of 200 µs was determined?
> Was it carefully derived from measurements across multiple platforms and
> workloads, or simply observed to “work” on one relatively stable setup?
>
> Understanding the basis for choosing that delay will help put the discussion
> into the right context.

TBH, I just experimented with values on our system and observed the readings of
cores running stress-ng. I tried 100us and that still gave variability
(values greater than
the theoretical max frequency). It's possible that the "optimal value"
is somewhere
between these two.

There have been other observations on this topic [1], that suggest
that even 100us
improves the error rate significantly from what it is with 2us.

BR,

-Prashant

[1] https://lore.kernel.org/all/20230328193846.8757-1-yang@os.amperecomputing.com/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ