lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <ZtrPTcEtynUUjBub@arm.com>
Date: Fri, 6 Sep 2024 11:45:49 +0200
From: Beata Michalska <beata.michalska@....com>
To: Jie Zhan <zhanjie9@...ilicon.com>
Cc: Catalin Marinas <catalin.marinas@....com>, linux-kernel@...r.kernel.org,
	linux-arm-kernel@...ts.infradead.org, ionela.voinescu@....com,
	sudeep.holla@....com, will@...nel.org, vincent.guittot@...aro.org,
	vanshikonda@...amperecomputing.com, sumitg@...dia.com,
	yang@...amperecomputing.com, lihuisong@...wei.com,
	viresh.kumar@...aro.org, rafael@...nel.org
Subject: Re: [PATCH v6 0/4] Add support for AArch64 AMUv1-based
 arch_freq_get_on_cpu

On Tue, Aug 27, 2024 at 08:56:28PM +0800, Jie Zhan wrote:
> On 26/08/2024 15:24, Beata Michalska wrote:
> 
> ...
> 
> > > I've recently tested this patchset on a Kunpeng system.
> > > It works as expected when reading scaling_cur_freq.
> > > The frequency samples are much stabler than what cppc_cpufreq returns.
> > Thank you for giving it a spin.
> > (and apologies for late reply)
> > > A few minor things.
> > > 
> > > 1. I observed larger errors on idle cpus than busy cpus, though it's just up
> > > to 1%.
> > > Not sure if this comes from the uncertain time interval between the last
> > > tick and entering idle.
> > > The shorter averaging interval, the larger error, I supposed.
> > All right - will look into it.
> > Just for my benefit: that diff is strictly between arch_freq_avg_get_on_cpu
> > and cpufreq_driver->get(policy->cpu) ?
> 
> I can't say whether it's "strictly" between them or not because driver->get()
> shows a fluctuating value.
> On my platform, cppc_cpufreq's driver->get() sometimes shows large errors on
> busy cpus (as reported by recent emails), but quite accurate on idle cpus (<1%).
> 
> With this patch, the "error" on idle cpus as mentioned above is typically <1%,
> hence better in general.
Ah, that's great then - I must have misunderstood your previous comment.
Apologies for that.
> 
> > > 2. In the current implementation, the resolution of frequency would be:
> > > max_freq_khz / SCHED_CAPACITY_SCALE
> > > This looks a bit unnecessary to me.
> > > 
> > > It's supposed to get a better resolution if we can do this in
> > > arch_freq_get_on_cpu():
> > > 
> > > freq = delta_cycle_cnt * max_freq_khz / delta_const_cnt
> > > 
> > > which may require caching both current and previous sets of counts in the
> > > per-cpu struct amu_cntr_sample.
> > > 
> > arch_freq_get_on_cpu relies on the frequency scale factor to derive the average
> > frequency. The scale factor is being calculated based on the deltas you have
> > mentioned and arch_max_freq_scale which uses SCHED_CAPACITY_SCALE*2 factor to
> > accommodate for rather low reference frequencies. arch_freq_get_on_cpu just does
> > somewhat reverse computation to that.
> 
> Yeah I understood this.
> 
> arch_freq_get_on_cpu() now returns:
> freq = (arch_scale_freq_capacity() * arch_scale_freq_ref()) >> SCHED_CAPACITY_SHIFT
> 
> The frequency resolution is (arch_scale_freq_ref() >> SCHED_CAPACITY_SHIFT), which
> is equivalent to max_freq_khz / SCHED_CAPACITY_SCALE.
> 
> If we can directly do
> freq = delta_cycle_cnt * ref_freq_khz / delta_const_cnt
> in arch_freq_get_on_cpu(), it's supposed to have a 1KHz resolution.
> (sorry for the wrong multiplier in the last reply)
> 
> Just similar to what's done in [1].
> 
> It could be more worthwhile to have a good resolution than utilising the
> arch_topology framework?
I guess the question would be whether the precision uplift justifies
the change ? In both cases, provided frequency value will carry over potential
error due to the nature of how it is being obtained. Furthermore, it is still
an average frequency so I am not really convinced that trading here for
a fraction of precision would bring any real value. Note that, the difference
between the two will fluctuate as well. If there is indeed a real need for
getting that extra precision* it would be good to see some actual numbers ?

---
BR
Beata

> 
> [1]https://lore.kernel.org/all/20240229162520.970986-2-vanshikonda@os.amperecomputing.com/
> 
> > 
> > ---
> > BR
> > Beata
> > > Kind regards,
> > > Jie
> > > 

Powered by blists - more mailing lists