Message-Id: <20250626093018.106265-1-dietmar.eggemann@arm.com>
Date: Thu, 26 Jun 2025 11:30:18 +0200
From: Dietmar Eggemann <dietmar.eggemann@....com>
To: "Rafael J . Wysocki" <rafael@...nel.org>,
	Viresh Kumar <viresh.kumar@...aro.org>,
	Sudeep Holla <sudeep.holla@....com>,
	Christian Loehle <christian.loehle@....com>
Cc: linux-pm@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	Robin Murphy <robin.murphy@....com>,
	Beata Michalska <beata.michalska@....com>,
	zhenglifeng1@...wei.com
Subject: [RFC PATCH] cpufreq,base/arch_topology: Calculate cpu_capacity according to boost

I noticed on my Arm64 big.LITTLE platform (Juno-r0, scmi-cpufreq) that
the cpu_scale values (/sys/devices/system/cpu/cpu*/cpu_capacity) of the
little CPU changed in v6.14 from 446 to 505. I bisected and found that
commit dd016f379ebc ("cpufreq: Introduce a more generic way to set
default per-policy boost flag") (1) introduced this change.
Juno's scmi FW marks the 2 topmost OPPs of each CPUfreq policy (policy0:
775000 850000, policy1: 950000 1100000) as boost OPPs.

The reason is that the 'policy->boost_enabled = true' is now done after
'cpufreq_table_validate_and_sort() -> cpufreq_frequency_table_cpuinfo()'
in cpufreq_online() so that 'policy->cpuinfo.max_freq' is set to the
'highest non-boost' instead of the 'highest boost' frequency.

This is before the CPUFREQ_CREATE_POLICY notifier is fired in
cpufreq_online() to which the cpu_capacity setup code in
[drivers/base/arch_topology.c] has registered.

Its notifier_call init_cpu_capacity_callback() uses
'policy->cpuinfo.max_freq' to set the per-cpu
capacity_freq_ref so that the cpu_capacity can be calculated as:

cpu_capacity = raw_cpu_capacity (2) * capacity_freq_ref /
				      'max system-wide cpu frequency'

(2) Juno's little CPU has 'capacity-dmips-mhz = <578>'.

So before (1) for a little CPU:

cpu_capacity = 578 * 850000 / 1100000 = 446

and after:

cpu_capacity = 578 * 700000 / 800000 = 505
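
The two results above can be reproduced with a small userspace sketch.
It models only the simplified formula quoted in this message with
truncating integer division; model_cpu_capacity() is a hypothetical
helper for illustration and is not the normalization code in
drivers/base/arch_topology.c.

```c
#include <stdint.h>

/*
 * Illustrative userspace model, not kernel code: scale a raw capacity
 * (capacity-dmips-mhz) by the ratio of this CPU's reference frequency
 * to the highest reference frequency in the system. Integer division
 * truncates, which matches the rounded-down values quoted above.
 */
static uint64_t model_cpu_capacity(uint64_t raw_capacity,
				   uint64_t capacity_freq_ref,
				   uint64_t max_system_freq)
{
	return raw_capacity * capacity_freq_ref / max_system_freq;
}
```

Calling model_cpu_capacity(578, 850000, 1100000) corresponds to the
'before' case (boost OPPs counted), and
model_cpu_capacity(578, 700000, 800000) to the 'after' case (boost
OPPs ignored on both policies).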

This issue can also be seen on Arm64 boards with cpufreq-dt drivers
using the 'turbo-mode' dt property for boosted OPPs.

What's actually needed IMHO is to calculate cpu_capacity according to
the boost value. I.e.:

(a) The infrastructure to adjust cpu_capacity in arch_topology.c has to
    be kept alive after boot.

(b) There has to be some kind of notification from cpufreq.c to
    arch_topology.c about the toggling of boost. I'm abusing
    CPUFREQ_CREATE_POLICY for this right now. Could we perhaps add a
    CPUFREQ_MOD_POLICY for this?

(c) Allow policy->cpuinfo.max_freq to be set unconditionally in
    cpufreq_frequency_table_cpuinfo(), so that it can drop again when
    boost is set to 0.
    This currently clashes with the behaviour described in the code
    comment there: if the driver has set a higher value, it is left
    untouched.
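
To illustrate (c), here is a minimal userspace model of the
max-frequency handling before and after the freq_table.c hunk. The
struct and function names (model_opp, model_policy, model_table_max,
model_cpuinfo_old, model_cpuinfo_new) are hypothetical stand-ins, and
the sketch assumes boost OPPs are skipped while boost is disabled; it
is not the kernel implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct model_opp {
	unsigned int freq_khz;
	bool boost;
};

struct model_policy {
	bool boost_enabled;
	unsigned int max;
	unsigned int cpuinfo_max_freq;
};

/* Highest table frequency, skipping boost OPPs while boost is off. */
static unsigned int model_table_max(const struct model_policy *p,
				    const struct model_opp *t, size_t n)
{
	unsigned int max_freq = 0;

	for (size_t i = 0; i < n; i++) {
		if (t[i].boost && !p->boost_enabled)
			continue;
		if (t[i].freq_khz > max_freq)
			max_freq = t[i].freq_khz;
	}
	return max_freq;
}

/* Before the patch: cpuinfo_max_freq is only ever raised, never lowered. */
static void model_cpuinfo_old(struct model_policy *p,
			      const struct model_opp *t, size_t n)
{
	unsigned int max_freq = model_table_max(p, t, n);

	p->max = max_freq;
	if (p->cpuinfo_max_freq < max_freq)
		p->max = p->cpuinfo_max_freq = max_freq;
}

/* With the patch: cpuinfo_max_freq always follows the table maximum. */
static void model_cpuinfo_new(struct model_policy *p,
			      const struct model_opp *t, size_t n)
{
	p->max = p->cpuinfo_max_freq = model_table_max(p, t, n);
}
```

With a little-CPU table holding 700000 (non-boost) plus 775000 and
850000 (boost), a policy that had cpuinfo_max_freq = 850000 and then
has boost disabled keeps 850000 under the old logic, while the new
logic lowers it to 700000.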

Tested on Arm64 Juno (scmi-cpufreq) and Hikey 960 (cpufreq-dt +
added 'turbo-mode' to the topmost OPPs in dts file).

This is probably related to what Christian Loehle tried to address in
https://lkml.kernel.org/r/3cc5b83b-f81c-4bd7-b7ff-4d02db4e25d8@arm.com .

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@....com>
---
 drivers/base/arch_topology.c | 11 -----------
 drivers/cpufreq/cpufreq.c    |  3 +++
 drivers/cpufreq/freq_table.c |  8 +-------
 3 files changed, 4 insertions(+), 18 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 1c3221ff1d1f..0a3916dc9644 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -378,8 +378,6 @@ void acpi_processor_init_invariance_cppc(void)
 
 #ifdef CONFIG_CPU_FREQ
 static cpumask_var_t cpus_to_visit;
-static void parsing_done_workfn(struct work_struct *work);
-static DECLARE_WORK(parsing_done_work, parsing_done_workfn);
 
 static int
 init_cpu_capacity_callback(struct notifier_block *nb,
@@ -408,10 +406,8 @@ init_cpu_capacity_callback(struct notifier_block *nb,
 		if (raw_capacity) {
 			topology_normalize_cpu_scale();
 			schedule_work(&update_topology_flags_work);
-			free_raw_capacity();
 		}
 		pr_debug("cpu_capacity: parsing done\n");
-		schedule_work(&parsing_done_work);
 	}
 
 	return 0;
@@ -447,13 +443,6 @@ static int __init register_cpufreq_notifier(void)
 }
 core_initcall(register_cpufreq_notifier);
 
-static void parsing_done_workfn(struct work_struct *work)
-{
-	cpufreq_unregister_notifier(&init_cpu_capacity_notifier,
-					 CPUFREQ_POLICY_NOTIFIER);
-	free_cpumask_var(cpus_to_visit);
-}
-
 #else
 core_initcall(free_raw_capacity);
 #endif
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index d7426e1d8bdd..6fdfcb6815d7 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -2798,6 +2798,9 @@ int cpufreq_boost_set_sw(struct cpufreq_policy *policy, int state)
 		return ret;
 	}
 
+	blocking_notifier_call_chain(&cpufreq_policy_notifier_list,
+				     CPUFREQ_CREATE_POLICY, policy);
+
 	ret = freq_qos_update_request(policy->max_freq_req, policy->max);
 	if (ret < 0)
 		return ret;
diff --git a/drivers/cpufreq/freq_table.c b/drivers/cpufreq/freq_table.c
index 35de513af6c9..06068a12ec53 100644
--- a/drivers/cpufreq/freq_table.c
+++ b/drivers/cpufreq/freq_table.c
@@ -51,13 +51,7 @@ int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy,
 	}
 
 	policy->min = policy->cpuinfo.min_freq = min_freq;
-	policy->max = max_freq;
-	/*
-	 * If the driver has set its own cpuinfo.max_freq above max_freq, leave
-	 * it as is.
-	 */
-	if (policy->cpuinfo.max_freq < max_freq)
-		policy->max = policy->cpuinfo.max_freq = max_freq;
+	policy->max = policy->cpuinfo.max_freq = max_freq;
 
 	if (policy->min == ~0)
 		return -EINVAL;
-- 
2.34.1

