lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <643df4d5-ad44-f84a-6b15-614876d3cd2d@hisilicon.com>
Date: Tue, 27 Aug 2024 21:03:56 +0800
From: Jie Zhan <zhanjie9@...ilicon.com>
To: Beata Michalska <beata.michalska@....com>
CC: Catalin Marinas <catalin.marinas@....com>, <linux-kernel@...r.kernel.org>,
	<linux-arm-kernel@...ts.infradead.org>, <ionela.voinescu@....com>,
	<sudeep.holla@....com>, <will@...nel.org>, <vincent.guittot@...aro.org>,
	<vanshikonda@...amperecomputing.com>, <sumitg@...dia.com>,
	<yang@...amperecomputing.com>, <lihuisong@...wei.com>,
	<viresh.kumar@...aro.org>, <rafael@...nel.org>
Subject: Re: [PATCH v6 0/4] Add support for AArch64 AMUv1-based
 arch_freq_get_on_cpu



On 26/08/2024 15:24, Beata Michalska wrote:
> ...
>> Hi Beata,
> Hi Jie,
>> I've recently tested this patchset on a Kunpeng system.
>> It works as expected when reading scaling_cur_freq.
>> The frequency samples are much stabler than what cppc_cpufreq returns.
> Thank you for giving it a spin.
> (and apologies for late reply)
>> A few minor things.
>>
>> 1. I observed larger errors on idle cpus than busy cpus, though it's just up
>> to 1%.
>> Not sure if this comes from the uncertain time interval between the last
>> tick and entering idle.
>> The shorter averaging interval, the larger error, I supposed.
> All right - will look into it.
> Just for my benefit: that diff is strictly between arch_freq_avg_get_on_cpu
> and cpufreq_driver->get(policy->cpu) ?
I can't say whether it's "strictly" between them or not because 
driver->get()
shows a fluctuating value.
On my platform, cppc_cpufreq's driver->get() sometimes shows large errors on
busy cpus (as reported by recent emails), but quite accurate on idle 
cpus (<1%).

With this patch, the "error" on idle cpus as mentioned above is 
typically <1%,
hence better in general.
>> 2. In the current implementation, the resolution of frequency would be:
>> max_freq_khz / SCHED_CAPACITY_SCALE
>> This looks a bit unnecessary to me.
>>
>> It's supposed to get a better resolution if we can do this in
>> arch_freq_get_on_cpu():
>>
>> freq = delta_cycle_cnt * max_freq_khz / delta_const_cnt
>>
>> which may require caching both current and previous sets of counts in the
>> per-cpu struct amu_cntr_sample.
>>
> arch_freq_get_on_cpu relies on the frequency scale factor to derive the average
> frequency. The scale factor is being calculated based on the deltas you have
> mentioned and arch_max_freq_scale which uses SCHED_CAPACITY_SCALE*2 factor to
> accommodate for rather low reference frequencies. arch_freq_get_on_cpu just does
> somewhat reverse computation to that.
Yeah I understood this.

arch_freq_get_on_cpu() now returns:
freq = (arch_scale_freq_capacity() * arch_scale_freq_ref()) >> 
SCHED_CAPACITY_SHIFT

The frequency resolution is (arch_scale_freq_ref() >> 
SCHED_CAPACITY_SHIFT), which
is equivalent to max_freq_khz / SCHED_CAPACITY_SCALE.

If we can directly do
freq = delta_cycle_cnt * ref_freq_khz / delta_const_cnt
in arch_freq_get_on_cpu(), it's supposed to have a 1KHz resolution.
(sorry for the wrong multiplier in the last reply)

Just similar to what's done in [1].

It could be more worthwhile to have a good resolution than utilising the
arch_topology framework?

[1] 
https://lore.kernel.org/all/20240229162520.970986-2-vanshikonda@os.amperecomputing.com/
> ---
> BR
> Beata
>> Kind regards,
>> Jie
>>



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ