[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <643df4d5-ad44-f84a-6b15-614876d3cd2d@hisilicon.com>
Date: Tue, 27 Aug 2024 21:03:56 +0800
From: Jie Zhan <zhanjie9@...ilicon.com>
To: Beata Michalska <beata.michalska@....com>
CC: Catalin Marinas <catalin.marinas@....com>, <linux-kernel@...r.kernel.org>,
<linux-arm-kernel@...ts.infradead.org>, <ionela.voinescu@....com>,
<sudeep.holla@....com>, <will@...nel.org>, <vincent.guittot@...aro.org>,
<vanshikonda@...amperecomputing.com>, <sumitg@...dia.com>,
<yang@...amperecomputing.com>, <lihuisong@...wei.com>,
<viresh.kumar@...aro.org>, <rafael@...nel.org>
Subject: Re: [PATCH v6 0/4] Add support for AArch64 AMUv1-based
arch_freq_get_on_cpu
On 26/08/2024 15:24, Beata Michalska wrote:
> ...
>> Hi Beata,
> Hi Jie,
>> I've recently tested this patchset on a Kunpeng system.
>> It works as expected when reading scaling_cur_freq.
>> The frequency samples are much stabler than what cppc_cpufreq returns.
> Thank you for giving it a spin.
> (and apologies for late reply)
>> A few minor things.
>>
>> 1. I observed larger errors on idle cpus than busy cpus, though it's just up
>> to 1%.
>> Not sure if this comes from the uncertain time interval between the last
>> tick and entering idle.
>> The shorter averaging interval, the larger error, I supposed.
> All right - will look into it.
> Just for my benefit: that diff is strictly between arch_freq_avg_get_on_cpu
> and cpufreq_driver->get(policy->cpu) ?
I can't say whether it's "strictly" between them or not because
driver->get()
shows a fluctuating value.
On my platform, cppc_cpufreq's driver->get() sometimes shows large errors on
busy cpus (as reported by recent emails), but quite accurate on idle
cpus (<1%).
With this patch, the "error" on idle cpus as mentioned above is
typically <1%,
hence better in general.
>> 2. In the current implementation, the resolution of frequency would be:
>> max_freq_khz / SCHED_CAPACITY_SCALE
>> This looks a bit unnecessary to me.
>>
>> It's supposed to get a better resolution if we can do this in
>> arch_freq_get_on_cpu():
>>
>> freq = delta_cycle_cnt * max_freq_khz / delta_const_cnt
>>
>> which may require caching both current and previous sets of counts in the
>> per-cpu struct amu_cntr_sample.
>>
> arch_freq_get_on_cpu relies on the frequency scale factor to derive the average
> frequency. The scale factor is being calculated based on the deltas you have
> mentioned and arch_max_freq_scale which uses SCHED_CAPACITY_SCALE*2 factor to
> accommodate for rather low reference frequencies. arch_freq_get_on_cpu just does
> somewhat reverse computation to that.
Yeah I understood this.
arch_freq_get_on_cpu() now returns:
freq = (arch_scale_freq_capacity() * arch_scale_freq_ref()) >>
SCHED_CAPACITY_SHIFT
The frequency resolution is (arch_scale_freq_ref() >>
SCHED_CAPACITY_SHIFT), which
is equivalent to max_freq_khz / SCHED_CAPACITY_SCALE.
If we can directly do
freq = delta_cycle_cnt * ref_freq_khz / delta_const_cnt
in arch_freq_get_on_cpu(), it's supposed to have a 1KHz resolution.
(sorry for the wrong multiplier in the last reply)
Just similar to what's done in [1].
It could be more worthwhile to have a good resolution than utilising the
arch_topology framework?
[1]
https://lore.kernel.org/all/20240229162520.970986-2-vanshikonda@os.amperecomputing.com/
> ---
> BR
> Beata
>> Kind regards,
>> Jie
>>
Powered by blists - more mailing lists