lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Message-ID: <ZtrPTcEtynUUjBub@arm.com> Date: Fri, 6 Sep 2024 11:45:49 +0200 From: Beata Michalska <beata.michalska@....com> To: Jie Zhan <zhanjie9@...ilicon.com> Cc: Catalin Marinas <catalin.marinas@....com>, linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org, ionela.voinescu@....com, sudeep.holla@....com, will@...nel.org, vincent.guittot@...aro.org, vanshikonda@...amperecomputing.com, sumitg@...dia.com, yang@...amperecomputing.com, lihuisong@...wei.com, viresh.kumar@...aro.org, rafael@...nel.org Subject: Re: [PATCH v6 0/4] Add support for AArch64 AMUv1-based arch_freq_get_on_cpu On Tue, Aug 27, 2024 at 08:56:28PM +0800, Jie Zhan wrote: > On 26/08/2024 15:24, Beata Michalska wrote: > > ... > > > > I've recently tested this patchset on a Kunpeng system. > > > It works as expected when reading scaling_cur_freq. > > > The frequency samples are much stabler than what cppc_cpufreq returns. > > Thank you for giving it a spin. > > (and apologies for late reply) > > > A few minor things. > > > > > > 1. I observed larger errors on idle cpus than busy cpus, though it's just up > > > to 1%. > > > Not sure if this comes from the uncertain time interval between the last > > > tick and entering idle. > > > The shorter averaging interval, the larger error, I supposed. > > All right - will look into it. > > Just for my benefit: that diff is strictly between arch_freq_avg_get_on_cpu > > and cpufreq_driver->get(policy->cpu) ? > > I can't say whether it's "strictly" between them or not because driver->get() > shows a fluctuating value. > On my platform, cppc_cpufreq's driver->get() sometimes shows large errors on > busy cpus (as reported by recent emails), but quite accurate on idle cpus (<1%). > > With this patch, the "error" on idle cpus as mentioned above is typically <1%, > hence better in general. Ah, that's great then - I must have misunderstood your previous comment. Apologies for that. > > > > 2. In the current implementation, the resolution of frequency would be: > > > max_freq_khz / SCHED_CAPACITY_SCALE > > > This looks a bit unnecessary to me. > > > > > > It's supposed to get a better resolution if we can do this in > > > arch_freq_get_on_cpu(): > > > > > > freq = delta_cycle_cnt * max_freq_khz / delta_const_cnt > > > > > > which may require caching both current and previous sets of counts in the > > > per-cpu struct amu_cntr_sample. > > > > > arch_freq_get_on_cpu relies on the frequency scale factor to derive the average > > frequency. The scale factor is being calculated based on the deltas you have > > mentioned and arch_max_freq_scale which uses SCHED_CAPACITY_SCALE*2 factor to > > accommodate for rather low reference frequencies. arch_freq_get_on_cpu just does > > somewhat reverse computation to that. > > Yeah I understood this. > > arch_freq_get_on_cpu() now returns: > freq = (arch_scale_freq_capacity() * arch_scale_freq_ref()) >> SCHED_CAPACITY_SHIFT > > The frequency resolution is (arch_scale_freq_ref() >> SCHED_CAPACITY_SHIFT), which > is equivalent to max_freq_khz / SCHED_CAPACITY_SCALE. > > If we can directly do > freq = delta_cycle_cnt * ref_freq_khz / delta_const_cnt > in arch_freq_get_on_cpu(), it's supposed to have a 1KHz resolution. > (sorry for the wrong multiplier in the last reply) > > Just similar to what's done in [1]. > > It could be more worthwhile to have a good resolution than utilising the > arch_topology framework? I guess the question would be whether the precision uplift justifies the change ? In both cases, provided frequency value will carry over potential error due to the nature of how it is being obtained. Furthermore, it is still an average frequency so I am not really convinced that trading here for a fraction of precision would bring any real value. Note that, the difference between the two will fluctuate as well. If there is indeed a real need for getting that extra precision* it would be good to see some actual numbers ? --- BR Beata > > [1]https://lore.kernel.org/all/20240229162520.970986-2-vanshikonda@os.amperecomputing.com/ > > > > > --- > > BR > > Beata > > > Kind regards, > > > Jie > > >
Powered by blists - more mailing lists