Message-ID: <561a9474-7be6-4c8a-8a5d-40efb186b3d2@huawei.com>
Date: Wed, 13 Aug 2025 18:17:54 +0800
From: "zhenglifeng (A)" <zhenglifeng1@...wei.com>
To: Beata Michalska <beata.michalska@....com>
CC: <catalin.marinas@....com>, <will@...nel.org>, <rafael@...nel.org>,
	<viresh.kumar@...aro.org>, <sudeep.holla@....com>,
	<linux-arm-kernel@...ts.infradead.org>, <linux-pm@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>, <linuxarm@...wei.com>,
	<jonathan.cameron@...wei.com>, <vincent.guittot@...aro.org>,
	<yangyicong@...ilicon.com>, <zhanjie9@...ilicon.com>, <lihuisong@...wei.com>,
	<yubowen8@...wei.com>, <linhongye@...artners.com>
Subject: Re: [PATCH v3 3/3] arm64: topology: Setup AMU FIE for online CPUs
 only

On 2025/8/6 17:55, Beata Michalska wrote:

> On Tue, Aug 05, 2025 at 05:33:30PM +0800, Lifeng Zheng wrote:
>> When booting with the maxcpu=1 restriction and LPI (Low Power Idle States) on,
>> only CPU0 will go online. The support AMU flag of CPU0 will be set but the
>> flags of other CPUs will not. This will cause AMU FIE set up fail for CPU0
>> when it shares a cpufreq policy with other CPU(s). After that, when other
>> CPUs are finally online and the support AMU flags of them are set, they'll
>> never have a chance to set up AMU FIE, even though they're eligible.
>>
>> To solve this problem, the process of setting up AMU FIE needs to be
>> modified as follows:
>>
>> 1. Set up AMU FIE only for the online CPUs.
>>
>> 2. Try to set up AMU FIE each time a CPU goes online and do the
>> freq_counters_valid() check. If this check fails, clear scale freq source
>> of all the CPUs related to the same policy, in case they use different
>> source of the freq scale.
>>
>> Signed-off-by: Lifeng Zheng <zhenglifeng1@...wei.com>
>> ---
>>  arch/arm64/kernel/topology.c | 54 ++++++++++++++++++++++++++++++++++--
>>  1 file changed, 52 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
>> index 9317a618bb87..b68621b3c071 100644
>> --- a/arch/arm64/kernel/topology.c
>> +++ b/arch/arm64/kernel/topology.c
>> @@ -385,7 +385,7 @@ static int init_amu_fie_callback(struct notifier_block *nb, unsigned long val,
>>  	struct cpufreq_policy *policy = data;
>>  
>>  	if (val == CPUFREQ_CREATE_POLICY)
>> -		amu_fie_setup(policy->related_cpus);
>> +		amu_fie_setup(policy->cpus);
> I think my previous comment still stands.
> This will indeed set the AMU FIE support for online cpus.
> Still, on each frequency change, arch_set_freq_scale will be called with
> `related_cpus`, and that mask will be used to verify support for AMU counters,
> and it will report there is none as 'related_cpus' won't be a subset of
> `scale_freq_counters_mask`. As a consequence, new scale will be set, as seen by
> the cpufreq. Now this will be corrected on next tick but it might cause
> disruptions. So this change should also be applied to cpufreq - if feasible, or
> at least be proven not to be an issue. Unless I am missing something.

I see what you mean now. Yes, I think you are right: this change should
also be applied to cpufreq. Thanks!
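
For illustration, the mismatch described above can be modelled in user space.
This is only a sketch, not kernel code: cpu_mask_t, mask_subset() and
cpufreq_scale_is_applied() are hypothetical stand-ins for struct cpumask,
cpumask_subset() and the subset test against scale_freq_counters_mask. It
assumes the cpufreq-provided scale is skipped only when every CPU in the
mask passed in is covered by counters.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct cpumask: bit N set means CPU N is in the mask. */
typedef unsigned long cpu_mask_t;

/* Returns true when every CPU in a is also in b. */
static bool mask_subset(cpu_mask_t a, cpu_mask_t b)
{
	return (a & ~b) == 0;
}

/*
 * Models the decision taken on a frequency change (an assumption of this
 * sketch): the cpufreq-provided scale is applied unless all CPUs in the
 * mask passed in are covered by the counters mask.
 */
static bool cpufreq_scale_is_applied(cpu_mask_t cpus, cpu_mask_t counters_mask)
{
	return !mask_subset(cpus, counters_mask);
}
```

With related_cpus = 0xF (CPU0-3) and only CPU0 in the counters mask, the
check applies the cpufreq scale; with a mask holding only CPU0 it does not.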

>>  
>>  	/*
>>  	 * We don't need to handle CPUFREQ_REMOVE_POLICY event as the AMU
>> @@ -404,10 +404,60 @@ static struct notifier_block init_amu_fie_notifier = {
>>  	.notifier_call = init_amu_fie_callback,
>>  };
>>  
>> +static int cpuhp_topology_online(unsigned int cpu)
>> +{
>> +	struct cpufreq_policy *policy = cpufreq_cpu_get_raw_no_check(cpu);
>> +
>> +	/*
>> +	 * If the online CPUs are not all AMU FIE CPUs or the new one is already
>> +	 * an AMU FIE one, no need to set it.
>> +	 */
>> +	if (!policy || !cpumask_available(amu_fie_cpus) ||
>> +	    !cpumask_subset(policy->cpus, amu_fie_cpus) ||
>> +	    cpumask_test_cpu(cpu, amu_fie_cpus))
>> +		return 0;
> This is getting rather cumbersome as the CPU that is coming online might belong
> to a policy that is yet to be created. Both AMU FIE support, as well as cpufreq,
> rely on dynamic hp state so, in theory, we cannot be certain that the cpufreq
> callback will be fired first (although that seems to be the case).
> If that does not happen, the policy will not exist, and as such given CPU
> will not use AMUs for FIE. The problem might be hypothetical but it at least
> deserves a comment I think.

Actually, this callback will always be fired before the cpufreq one,
because init_amu_fie() is called before any cpufreq driver init function
(it has to be, otherwise init_amu_fie_notifier could not be registered
before it is needed). And with dynamic hp states, the callback that is set
up first is called first when a CPU comes online. So in your hypothetical
scenario, yes, the policy will not exist and this function will do nothing.
But that's OK, because the notifier callback will do what needs to be done
when the policy is created.
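
The ordering argument can be sketched as a small user-space model. It is
not the kernel implementation: DYN_SLOTS, dyn_setup_state(), dyn_bring_up()
and the two logging callbacks are hypothetical names. It assumes a dynamic
state request takes the first free slot in the dynamic range and that
states are walked in ascending order when a CPU comes online.

```c
#include <stddef.h>

#define DYN_SLOTS 8	/* hypothetical size of the dynamic state range */

typedef void (*online_cb)(unsigned int cpu);

static online_cb dyn_state[DYN_SLOTS];

/* Reserve the first free dynamic slot; return its index, or -1 if full. */
static int dyn_setup_state(online_cb cb)
{
	for (int i = 0; i < DYN_SLOTS; i++) {
		if (!dyn_state[i]) {
			dyn_state[i] = cb;
			return i;
		}
	}
	return -1;
}

/* Bring a CPU online: invoke the registered callbacks in ascending slot order. */
static void dyn_bring_up(unsigned int cpu)
{
	for (int i = 0; i < DYN_SLOTS; i++)
		if (dyn_state[i])
			dyn_state[i](cpu);
}

/* Call-order log filled in by the two stand-in callbacks below. */
static char call_log[DYN_SLOTS];
static size_t call_len;

static void log_call(char tag)
{
	if (call_len < DYN_SLOTS)
		call_log[call_len++] = tag;
}

/* Stand-in for the topology online callback. */
static void topology_online(unsigned int cpu) { (void)cpu; log_call('T'); }

/* Stand-in for the cpufreq online callback. */
static void cpufreq_online(unsigned int cpu) { (void)cpu; log_call('C'); }
```

Registering topology_online before cpufreq_online and then calling
dyn_bring_up() logs 'T' before 'C' in this model.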

> Second problem is cpumask_available use: this might be the very first CPU that
> might potentially rely on AMUs for frequency invariance so that mask may not be
> available yet. That does not mean AMUs setup should be skipped. Not just yet,
> at least. Again more hypothetical.

Same here: this will be handled in the notifier callback when the policy is
created.

> BTW, there should be `amu_fie_cpu_supported'.

Sorry, I can't see why it is needed. Could you explain further?

>> +
>> +	/*
>> +	 * If the new online CPU cannot pass this check, all the CPUs related to
>> +	 * the same policy should be clear from amu_fie_cpus mask, otherwise they
>> +	 * may use different source of the freq scale.
>> +	 */
>> +	if (!freq_counters_valid(cpu)) {
>> +		topology_clear_scale_freq_source(SCALE_FREQ_SOURCE_ARCH,
>> +						 policy->related_cpus);
>> +		cpumask_andnot(amu_fie_cpus, amu_fie_cpus, policy->related_cpus);
> I think it might deserve a warning as this probably should not happen.

Yes, makes sense. Thanks!

> 
> ---
> BR
> Beata
>> +		return 0;
>> +	}
>> +
>> +	cpumask_set_cpu(cpu, amu_fie_cpus);
>> +
>> +	topology_set_scale_freq_source(&amu_sfd, cpumask_of(cpu));
>> +
>> +	pr_debug("CPU[%u]: counter will be used for FIE.", cpu);
>> +
>> +	return 0;
>> +}
>> +
>>  static int __init init_amu_fie(void)
>>  {
>> -	return cpufreq_register_notifier(&init_amu_fie_notifier,
>> +	int ret;
>> +
>> +	ret = cpufreq_register_notifier(&init_amu_fie_notifier,
>>  					CPUFREQ_POLICY_NOTIFIER);
>> +	if (ret)
>> +		return ret;
>> +
>> +	ret = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
>> +					"arm64/topology:online",
>> +					cpuhp_topology_online,
>> +					NULL);
>> +	if (ret < 0) {
>> +		cpufreq_unregister_notifier(&init_amu_fie_notifier,
>> +					    CPUFREQ_POLICY_NOTIFIER);
>> +		return ret;
>> +	}
>> +
>> +	return 0;
>>  }
>>  core_initcall(init_amu_fie);
>>  
>> -- 
>> 2.33.0
>>
> 

