Message-ID: <7eozim2xnepacnnkzxlbx34hib4otycnbn4dqymfziqou5lw5u@5xzpv3t7sxo3>
Date: Thu, 22 Feb 2024 11:55:51 -0800
From: Vanshidhar Konda <vanshikonda@...amperecomputing.com>
To: Beata Michalska <beata.michalska@....com>
Cc: linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org, 
	linux-pm@...r.kernel.org, sumitg@...dia.com, sudeep.holla@....covm, will@...nel.org, 
	catalin.marinas@....com, viresh.kumar@...aro.org, rafael@...nel.org, 
	ionela.voinescu@....com, yang@...amperecomputing.com, linux-tegra@...r.kernel.org, 
	Sudeep Holla <sudeep.holla@....com>, lihuisong@...wei.com
Subject: Re: [PATCH v2 1/2] arm64: Provide an AMU-based version of
 arch_freq_get_on_cpu

Hello Beata,

I tested this patch, based on the discussion in [1], on an AmpereOne
system while the system was mostly idle. The results below are with only
the first patch in this series applied to the kernel. I noticed that the
frequencies reported through scaling_cur_freq and cpuinfo_cur_freq
differ significantly for some cores. When the cores are loaded using
stress-ng, the scaling_cur_freq and cpuinfo_cur_freq values are quite
similar.

Applying the second patch in this series causes the difference between
scaling_cur_freq and cpuinfo_cur_freq to disappear, but calculating the
core frequency based on the feedback_ctrs shows that the value is
incorrect for both.
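
For reference, the feedback_ctrs calculation mentioned above can be
sketched as follows. This is a minimal userspace sketch, not the exact
procedure used for these measurements. It assumes two snapshots of the
reference and delivered counters (as exposed in the per-CPU acpi_cppc
sysfs directory on platforms that provide it) plus the platform's
reference_perf, nominal_perf and nominal frequency. The struct and
function names are hypothetical, and the caller is assumed to convert
the nominal frequency to kHz first.

```c
#include <stdint.h>

/* One snapshot of the CPPC feedback counters (hypothetical helper type). */
struct fb_ctrs {
	uint64_t ref; /* reference counter: advances at a fixed rate */
	uint64_t del; /* delivered counter: advances with the actual core rate */
};

/*
 * Average frequency in kHz between two snapshots t0 and t1.
 *
 *   delivered_perf = reference_perf * delta(del) / delta(ref)
 *   freq           = delivered_perf * nominal_freq / nominal_perf
 *
 * Returns 0 when the reference counter did not advance or when
 * nominal_perf is 0, since no meaningful ratio exists in that case.
 */
static uint64_t avg_freq_khz(struct fb_ctrs t0, struct fb_ctrs t1,
			     uint64_t reference_perf, uint64_t nominal_perf,
			     uint64_t nominal_freq_khz)
{
	uint64_t d_ref = t1.ref - t0.ref;
	uint64_t d_del = t1.del - t0.del;
	double perf;

	if (d_ref == 0 || nominal_perf == 0)
		return 0;

	/* Floating point avoids overflow in the intermediate products. */
	perf = (double)reference_perf * (double)d_del / (double)d_ref;

	/* Round to the nearest kHz. */
	return (uint64_t)(perf * (double)nominal_freq_khz /
			  (double)nominal_perf + 0.5);
}
```

Sampling the counters twice with a short delay in between and passing
both snapshots to avg_freq_khz() gives the average frequency over that
interval, which can then be compared with the sysfs values.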

The kernel I compiled for testing is based on the Fedora 39 config file.
These configs seem relevant to the discussion:

CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
# CONFIG_NO_HZ_IDLE is not set
CONFIG_NO_HZ_FULL=y
CONFIG_CONTEXT_TRACKING_USER=y
# CONFIG_CONTEXT_TRACKING_USER_FORCE is not set
CONFIG_NO_HZ=y

[1]: https://lore.kernel.org/linux-arm-kernel/20231212072617.14756-1-lihuisong@huawei.com/

Results:
cpu_num   scaling_cur_freq   cpuinfo_cur_freq
[11]:     1874560            1000000
[12]:     2056158            1385000
[13]:     1974146            1000000
[21]:     1587518            1000000
[23]:     1593376            1000000
..
.. skipping similar results for ~50 other cores for brevity
..
nohz_full=113-118
[113]:    1874560            1000000
[114]:    1968288            1000000
[115]:    1962430            1000000
[116]:    1871631            1000000
[117]:    1877489            1000000
[118]:    1877489            1000000
isolcpus=119-127
[119]:    2999296            1000000
[120]:    2999296            1000000
[121]:    2999296            1000000
[125]:    2999296            1000000
[126]:    2999296            1000000
[127]:    2999296            1000000

Thanks,
Vanshi

On Mon, Nov 27, 2023 at 04:08:37PM +0000, Beata Michalska wrote:
>With the Frequency Invariance Engine (FIE) being already wired up with
>sched tick and making use of relevant (core counter and constant
>counter) AMU counters, getting the current frequency for a given CPU
>on supported platforms, can be achieved by utilizing the frequency scale
>factor which reflects an average CPU frequency for the last tick period
>length.
>
>Suggested-by: Ionela Voinescu <ionela.voinescu@....com>
>Signed-off-by: Beata Michalska <beata.michalska@....com>
>Reviewed-by: Sudeep Holla <sudeep.holla@....com>
>---
>
>Notes:
>    Due to [1], if merged, there might be a need to modify the patch to
>    accommodate changes [1] introduces:
>
>    	freq = cpufreq_get_hw_max_freq(cpu) >> SCHED_CAPACITY_SHIFT
>    	to
>    	freq = per_cpu(capacity_freq_ref, cpu) >> SCHED_CAPACITY_SHIFT
>    [1]
>    https://lore.kernel.org/linux-arm-kernel/20231121154349.GA1938@willie-the-truck/T/#mcb018d076dbce6f60ed2779634a9b6ffe622641e
>
> arch/arm64/kernel/topology.c | 39 ++++++++++++++++++++++++++++++++++++
> 1 file changed, 39 insertions(+)
>
>diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
>index 615c1a20129f..ae2445f6e7da 100644
>--- a/arch/arm64/kernel/topology.c
>+++ b/arch/arm64/kernel/topology.c
>@@ -17,6 +17,7 @@
> #include <linux/cpufreq.h>
> #include <linux/init.h>
> #include <linux/percpu.h>
>+#include <linux/sched/isolation.h>
>
> #include <asm/cpu.h>
> #include <asm/cputype.h>
>@@ -186,6 +187,44 @@ static void amu_scale_freq_tick(void)
> 	this_cpu_write(arch_freq_scale, (unsigned long)scale);
> }
>
>+unsigned int arch_freq_get_on_cpu(int cpu)
>+{
>+	unsigned int freq;
>+	u64 scale;
>+
>+	if (!cpumask_test_cpu(cpu, amu_fie_cpus))
>+		return 0;
>+
>+	/*
>+	 * For those CPUs that are in full dynticks mode, try an alternative
>+	 * source for the counters (and thus freq scale),
>+	 * if available for given policy
>+	 */
>+	if (!housekeeping_cpu(cpu, HK_TYPE_TICK)) {
>+		struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
>+		int ref_cpu = nr_cpu_ids;
>+
>+		if (cpumask_intersects(housekeeping_cpumask(HK_TYPE_TICK),
>+				       policy->cpus))
>+			ref_cpu = cpumask_nth_and(cpu, policy->cpus,
>+						  housekeeping_cpumask(HK_TYPE_TICK));
>+		cpufreq_cpu_put(policy);
>+		if (ref_cpu >= nr_cpu_ids)
>+			return 0;
>+		cpu = ref_cpu;
>+	}
>+
>+	/*
>+	 * Reversed computation to the one used to determine
>+	 * the arch_freq_scale value
>+	 * (see amu_scale_freq_tick for details)
>+	 */
>+	scale = per_cpu(arch_freq_scale, cpu);
>+	freq = cpufreq_get_hw_max_freq(cpu) >> SCHED_CAPACITY_SHIFT;
>+	freq *= scale;
>+	return freq;
>+}
>+
> static struct scale_freq_data amu_sfd = {
> 	.source = SCALE_FREQ_SOURCE_ARCH,
> 	.set_freq_scale = amu_scale_freq_tick,
>-- 
>2.25.1
>
>
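
The reversed computation at the end of the quoted
arch_freq_get_on_cpu() can be modelled in userspace to see which
values it is able to produce. The sketch below mirrors only its last
three statements; the function name freq_from_scale is hypothetical,
and the 3000000 kHz maximum used in the note after the code is an
assumption that is not stated anywhere in this thread.

```c
#include <stdint.h>

/* Same value as in the kernel: SCHED_CAPACITY_SCALE is 1 << 10. */
#define SCHED_CAPACITY_SHIFT 10

/*
 * Userspace model of:
 *   freq = cpufreq_get_hw_max_freq(cpu) >> SCHED_CAPACITY_SHIFT;
 *   freq *= scale;
 * max_freq_khz stands in for the value returned by
 * cpufreq_get_hw_max_freq(), scale for per_cpu(arch_freq_scale, cpu).
 */
static unsigned int freq_from_scale(unsigned int max_freq_khz, uint64_t scale)
{
	/* The shift happens first, so the low 10 bits of max_freq are lost. */
	unsigned int freq = max_freq_khz >> SCHED_CAPACITY_SHIFT;

	freq *= scale;
	return freq;
}
```

Under the assumed 3000000 kHz maximum, a scale of 1024 yields 2999296
and a scale of 640 yields 1874560, both of which appear in the
scaling_cur_freq column above. Because the shift is applied before the
multiplication, every result is a multiple of max_freq >> 10, so the
exact maximum frequency cannot be reported by this computation.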
