Message-ID: <20260119141118.GF830229@noisy.programming.kicks-ass.net>
Date: Mon, 19 Jan 2026 15:11:18 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Shrikanth Hegde <sshegde@...ux.ibm.com>
Cc: Vishal Chourasia <vishalc@...ux.ibm.com>, boqun.feng@...il.com,
	frederic@...nel.org, joelagnelf@...dia.com, josh@...htriplett.org,
	linux-kernel@...r.kernel.org, neeraj.upadhyay@...nel.org,
	paulmck@...nel.org, rcu@...r.kernel.org, rostedt@...dmis.org,
	srikar@...ux.ibm.com, tglx@...utronix.de, urezki@...il.com,
	samir@...ux.ibm.com
Subject: Re: [PATCH] cpuhp: Expedite synchronize_rcu during SMT switch

On Mon, Jan 19, 2026 at 07:15:09PM +0530, Shrikanth Hegde wrote:
> Hi Peter.
> 
> On 1/19/26 5:13 PM, Peter Zijlstra wrote:
> > On Mon, Jan 19, 2026 at 04:17:40PM +0530, Vishal Chourasia wrote:
> > > Expedite synchronize_rcu() during the cpuhp_smt_[enable|disable] path to
> > > accelerate the operation.
> > > 
> > > Bulk CPU hotplug operations—such as switching SMT modes across all
> > > cores—require hotplugging multiple CPUs in rapid succession. On large
> > > systems, this process takes significant time, increasing as the number
> > > of CPUs to hotplug during SMT switch grows, leading to substantial
> > > delays on high-core-count machines. Analysis [1] reveals that the
> > > majority of this time is spent waiting for synchronize_rcu().
> > > 
> > 
> > You seem to have left out all the useful bits from your changelog again
> > :/
> > 
> > Anyway, ISTR Joel posted a patch hoisting a lock; it was icky, but not
> > something we can't live with either.
> > 
> > Also, memory got jogged and I think something like the below will remove
> > 2/3 of your rcu woes as well.
> > 
> > diff --git a/kernel/cpu.c b/kernel/cpu.c
> > index 8df2d773fe3b..1365c19444b2 100644
> > --- a/kernel/cpu.c
> > +++ b/kernel/cpu.c
> > @@ -2669,6 +2669,7 @@ int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
> >   	int cpu, ret = 0;
> >   	cpu_maps_update_begin();
> > +	rcu_sync_enter(&cpu_hotplug_lock.rss);
> >   	for_each_online_cpu(cpu) {
> >   		if (topology_is_primary_thread(cpu))
> >   			continue;
> > @@ -2698,6 +2699,7 @@ int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
> >   	}
> >   	if (!ret)
> >   		cpu_smt_control = ctrlval;
> > +	rcu_sync_exit(&cpu_hotplug_lock.rss);
> >   	cpu_maps_update_done();
> >   	return ret;
> >   }
> > @@ -2715,6 +2717,7 @@ int cpuhp_smt_enable(void)
> >   	int cpu, ret = 0;
> >   	cpu_maps_update_begin();
> > +	rcu_sync_enter(&cpu_hotplug_lock.rss);
> >   	cpu_smt_control = CPU_SMT_ENABLED;
> >   	for_each_present_cpu(cpu) {
> >   		/* Skip online CPUs and CPUs on offline nodes */
> > @@ -2728,6 +2731,7 @@ int cpuhp_smt_enable(void)
> >   		/* See comment in cpuhp_smt_disable() */
> >   		cpuhp_online_cpu_device(cpu);
> >   	}
> > +	rcu_sync_exit(&cpu_hotplug_lock.rss);
> >   	cpu_maps_update_done();
> >   	return ret;
> >   }
> 
> 
> Currently, cpuhp_smt_[enable/disable] calls _cpu_up/_cpu_down,
> which does the same in cpus_write_lock/unlock, though it is done per
> CPU, one enable/disable after another.
> 
> How will hoisting this up help?

By holding an extra rcu_sync reference, the percpu rwsem is kept in
the slow path, avoiding the rcu-sync on down_write(), which was very
prevalent per this:

  https://lkml.kernel.org/r/aWU9HRcs4ghazIRg@linux.ibm.com
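
To illustrate the counting argument, here is a toy userspace model, not the kernel implementation. It assumes that only the first rcu_sync reference (count going 0 -> 1) pays for a grace period, and that the state returns to idle as soon as the last reference is dropped; the real rcu_sync defers that transition through an RCU callback. All names (toy_rcu_sync, toy_rcu_sync_enter, toy_rcu_sync_exit, count_grace_periods) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct rcu_sync: only the reference count is modeled. */
struct toy_rcu_sync {
	int gp_count;       /* outstanding enter() references */
	int grace_periods;  /* simulated synchronize_rcu() waits */
};

/* Assumption: only the 0 -> 1 transition has to wait for a grace period. */
static void toy_rcu_sync_enter(struct toy_rcu_sync *rss)
{
	if (rss->gp_count++ == 0)
		rss->grace_periods++;
}

/* Assumption: dropping the last reference returns to idle immediately. */
static void toy_rcu_sync_exit(struct toy_rcu_sync *rss)
{
	assert(rss->gp_count > 0);
	rss->gp_count--;
}

/*
 * Simulate hotplugging ncpus CPUs one after another. Each operation takes
 * and drops one reference, standing in for the write lock/unlock pair.
 * With hoist set, an extra reference is held across the whole loop.
 * Returns the number of simulated grace-period waits.
 */
int count_grace_periods(int ncpus, bool hoist)
{
	struct toy_rcu_sync rss = { 0, 0 };
	int i;

	if (hoist)
		toy_rcu_sync_enter(&rss);

	for (i = 0; i < ncpus; i++) {
		toy_rcu_sync_enter(&rss);  /* models the write lock */
		toy_rcu_sync_exit(&rss);   /* models the write unlock */
	}

	if (hoist)
		toy_rcu_sync_exit(&rss);

	return rss.grace_periods;
}
```

In this model, count_grace_periods(n, false) grows with n, while count_grace_periods(n, true) stays at one regardless of n, which is the effect the hoisted reference is after.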