Message-ID: <Z68yzBURiIr_7Lmy@pavilion.home>
Date: Fri, 14 Feb 2025 13:10:52 +0100
From: Frederic Weisbecker <frederic@...nel.org>
To: "Paul E. McKenney" <paulmck@...nel.org>
Cc: LKML <linux-kernel@...r.kernel.org>, Boqun Feng <boqun.feng@...il.com>,
	Joel Fernandes <joel@...lfernandes.org>,
	Neeraj Upadhyay <neeraj.upadhyay@....com>,
	Uladzislau Rezki <urezki@...il.com>,
	Zqiang <qiang.zhang1211@...il.com>, rcu <rcu@...r.kernel.org>
Subject: Re: [PATCH 3/3] rcu/exp: Remove needless CPU up quiescent state
 report

On Fri, Feb 14, 2025 at 01:01:56AM -0800, Paul E. McKenney wrote:
> On Fri, Feb 14, 2025 at 12:25:59AM +0100, Frederic Weisbecker wrote:
> > A CPU coming online checks for an ongoing grace period and reports
> > a quiescent state accordingly if needed. This special treatment, which
> > shortcuts the expedited IPI, originated as an optimization in the
> > following commit:
> > 
> > 	338b0f760e84 ("rcu: Better hotplug handling for synchronize_sched_expedited()")
> > 
> > The point is to avoid an IPI while waiting for a CPU to become online
> > or failing to become offline.
> > 
> > However this is pointless and even error prone for several reasons:
> > 
> > * If the CPU has been seen offline in the first round scanning offline
> >   and idle CPUs, no IPI is even tried and the quiescent state is
> >   reported on behalf of the CPU.
> > 
> > * This means that if the IPI fails, the CPU just became offline. So
> >   it's unlikely to become online right away, unless the cpu hotplug
> >   operation failed and rolled back, which is a rare event that can
> >   wait a jiffy for a new IPI to be issued.
> > 
> > * But then the "optimization" for a failing CPU hotplug down operation
> >   only applies to !PREEMPT_RCU.
> > 
> > * This forcibly reports a quiescent state even if ->cpu_no_qs.b.exp is not
> >   set. As a result it can race with remote QS reports on the same rdp.
> >   Fortunately it happens to be OK but an accident is waiting to happen.
> > 
> > For all those reasons, remove this optimization that doesn't look worth
> > keeping around.
> 
> Thank you for digging into this!
> 
> When I ran tests that removed the call to sync_sched_exp_online_cleanup()
> a few months ago, I got grace-period hangs [1].  Has something changed
> to make this safe?

Hmm, but was it before or after "rcu: Fix get_state_synchronize_rcu_full()
GP-start detection"?

And if after, do we know why?

Thanks.
