Message-ID: <068ef765-8999-41c0-8733-1184df2adb3a@linux.ibm.com>
Date: Tue, 27 Jan 2026 23:18:59 +0530
From: Samir M <samir@...ux.ibm.com>
To: Peter Zijlstra <peterz@...radead.org>,
        Vishal Chourasia <vishalc@...ux.ibm.com>
Cc: boqun.feng@...il.com, frederic@...nel.org, joelagnelf@...dia.com,
        josh@...htriplett.org, linux-kernel@...r.kernel.org,
        neeraj.upadhyay@...nel.org, paulmck@...nel.org, rcu@...r.kernel.org,
        rostedt@...dmis.org, srikar@...ux.ibm.com, sshegde@...ux.ibm.com,
        tglx@...utronix.de, urezki@...il.com
Subject: Re: [PATCH] cpuhp: Expedite synchronize_rcu during SMT switch


On 19/01/26 5:13 pm, Peter Zijlstra wrote:
> On Mon, Jan 19, 2026 at 04:17:40PM +0530, Vishal Chourasia wrote:
>> Expedite synchronize_rcu() during the cpuhp_smt_[enable|disable] path to
>> accelerate the operation.
>>
>> Bulk CPU hotplug operations—such as switching SMT modes across all
>> cores—require hotplugging multiple CPUs in rapid succession. On large
>> systems, this process takes significant time, increasing as the number
>> of CPUs to hotplug during SMT switch grows, leading to substantial
>> delays on high-core-count machines. Analysis [1] reveals that the
>> majority of this time is spent waiting for synchronize_rcu().
>>
> You seem to have left out all the useful bits from your changelog again
> :/
>
> Anyway, ISTR Joel posted a patch hoisting a lock; it was a icky, but not
> something we can't live with either.
>
> Also, memory got jogged and I think something like the below will remove
> 2/3 of your rcu woes as well.
>
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 8df2d773fe3b..1365c19444b2 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -2669,6 +2669,7 @@ int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
>   	int cpu, ret = 0;
>   
>   	cpu_maps_update_begin();
> +	rcu_sync_enter(&cpu_hotplug_lock.rss);
>   	for_each_online_cpu(cpu) {
>   		if (topology_is_primary_thread(cpu))
>   			continue;
> @@ -2698,6 +2699,7 @@ int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
>   	}
>   	if (!ret)
>   		cpu_smt_control = ctrlval;
> +	rcu_sync_exit(&cpu_hotplug_lock.rss);
>   	cpu_maps_update_done();
>   	return ret;
>   }
> @@ -2715,6 +2717,7 @@ int cpuhp_smt_enable(void)
>   	int cpu, ret = 0;
>   
>   	cpu_maps_update_begin();
> +	rcu_sync_enter(&cpu_hotplug_lock.rss);
>   	cpu_smt_control = CPU_SMT_ENABLED;
>   	for_each_present_cpu(cpu) {
>   		/* Skip online CPUs and CPUs on offline nodes */
> @@ -2728,6 +2731,7 @@ int cpuhp_smt_enable(void)
>   		/* See comment in cpuhp_smt_disable() */
>   		cpuhp_online_cpu_device(cpu);
>   	}
> +	rcu_sync_exit(&cpu_hotplug_lock.rss);
>   	cpu_maps_update_done();
>   	return ret;
>   }


Hi,

I verified this patch using the configuration described below.
Configuration:
     •    Kernel version: 6.19.0-rc6
     •    Number of CPUs: 1536

An older version of this patch was verified earlier on a system with
2048 CPUs. That system was unavailable this time, so the current
verification was carried out on a different system.


Using this setup, I evaluated the patch with both SMT enabled and SMT
disabled. The patch shows a significant improvement in the SMT=off case
and a measurable improvement in the SMT=on case.
The results also show that system time is noticeably higher when SMT is
enabled. With SMT disabled, no significant increase in system time is
observed.

SMT=ON  -> sys 50m42.805s
SMT=OFF -> sys 0m0.064s


SMT Mode    | Without Patch    | With Patch   | % Improvement   |
------------------------------------------------------------------
SMT=off     | 20m 32.210s      |  5m 30.898s  | +73.15%         |
SMT=on      | 62m 46.549s      | 55m 45.671s  | +11.18%         |


Please add the tag below:
Tested-by: Samir M <samir@...ux.ibm.com>

Regards,
Samir

