Message-ID: <CAEXW_YQSxGiLT8nPGRWm1v9+4X+N9+JpL+fq+hOH4BpSUCkcXQ@mail.gmail.com>
Date: Fri, 24 Jan 2025 15:21:53 -0500
From: Joel Fernandes <joel@...lfernandes.org>
To: Frederic Weisbecker <frederic@...nel.org>
Cc: paulmck@...nel.org, rcu@...r.kernel.org, linux-kernel@...r.kernel.org, 
	kernel-team@...a.com, rostedt@...dmis.org
Subject: Re: [PATCH RFC rcu] Fix get_state_synchronize_rcu_full() GP-start detection

On Fri, Jan 24, 2025 at 9:56 AM Frederic Weisbecker <frederic@...nel.org> wrote:
>
> > On Thu, Jan 23, 2025 at 08:49:47PM -0500, Joel Fernandes wrote:
> > On Thu, Dec 12, 2024 at 7:59 PM Paul E. McKenney <paulmck@...nel.org> wrote:
> > >
> > > The get_state_synchronize_rcu_full() and poll_state_synchronize_rcu_full()
> > > functions use the root rcu_node structure's ->gp_seq field to detect
> > > the beginnings and ends of grace periods, respectively.  This choice is
> > > necessary for the poll_state_synchronize_rcu_full() function because
> > > (give or take counter wrap), the following sequence is guaranteed not
> > > to trigger the WARN_ON_ONCE():
> > >
> > >         get_state_synchronize_rcu_full(&rgos);
> > >         synchronize_rcu();
> > >         WARN_ON_ONCE(!poll_state_synchronize_rcu_full(&rgos));
> > >
> > > The RCU callbacks that awaken synchronize_rcu() instances are
> > > guaranteed not to be invoked before the root rcu_node structure's
> > > ->gp_seq field is updated to indicate the end of the grace period.
> > > However, these callbacks might start being invoked immediately
> > > thereafter, in particular, before rcu_state.gp_seq has been updated.
> > > Therefore, poll_state_synchronize_rcu_full() must refer to the
> > > root rcu_node structure's ->gp_seq field.  Because this field is
> > > updated under this structure's ->lock, any code following a call to
> > > poll_state_synchronize_rcu_full() will be fully ordered after the
> > > full grace-period computation, as is required by RCU's memory-ordering
> > > semantics.
> > >
> > > By symmetry, the get_state_synchronize_rcu_full() function should also
> > > use this same root rcu_node structure's ->gp_seq field.  But it turns out
> > > that symmetry is profoundly (though extremely infrequently) destructive
> > > in this case.  To see this, consider the following sequence of events:
> > >
> > > 1.      CPU 0 starts a new grace period, and updates rcu_state.gp_seq
> > >         accordingly.
>
> I don't think so, because idle CPUs are waited upon to report a QS, unlike
> offline CPUs, which don't appear in ->qsmaskinitnext.
>
> If CPU 1 is idle while the grace-period kthread scans its
> ct_rcu_watching_cpu(), then the QS is reported on its behalf, and when CPU 1
> exits idle it is guaranteed to see the newly started GP on the root node.
>
> If CPU 1 is not idle while the grace-period kthread scans its
> ct_rcu_watching_cpu(), then CPU 1 must itself report a QS, which closes the race.

Sorry, you are right. Idle CPUs are required to have a QS reported on
their behalf. My bad.

thanks,

 - Joel
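
For reference, a minimal usage sketch of the polled grace-period API under
discussion, assuming the usual cookie pattern; struct foo, foo_retire() and
foo_try_free() below are illustrative names, not taken from the patch:

        #include <linux/rcupdate.h>
        #include <linux/slab.h>

        struct foo {
                struct rcu_gp_oldstate rgos;
                int data;
        };

        /* Retire @fp once no new readers can find it: take a full
         * (normal + expedited) grace-period snapshot for later polling. */
        static void foo_retire(struct foo *fp)
        {
                get_state_synchronize_rcu_full(&fp->rgos);
                /* ... queue fp on a deferred-free list ... */
        }

        /* Called later, e.g. from a periodic worker: free @fp only if a
         * full grace period has elapsed since foo_retire(). */
        static bool foo_try_free(struct foo *fp)
        {
                if (!poll_state_synchronize_rcu_full(&fp->rgos))
                        return false;   /* GP not yet over, retry later. */
                kfree(fp);
                return true;
        }

The WARN_ON_ONCE() sequence quoted above is the guarantee this pattern relies
on: after synchronize_rcu() returns, a cookie taken beforehand must report the
grace period as completed.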
