lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 15 Mar 2021 17:02:23 -0700
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     Frederic Weisbecker <frederic@...nel.org>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Boqun Feng <boqun.feng@...il.com>,
        Lai Jiangshan <jiangshanlai@...il.com>,
        Neeraj Upadhyay <neeraju@...eaurora.org>,
        Josh Triplett <josh@...htriplett.org>,
        Stable <stable@...r.kernel.org>,
        Joel Fernandes <joel@...lfernandes.org>
Subject: Re: [PATCH 05/13] rcu/nocb: Use the rcuog CPU's ->nocb_timer

On Wed, Mar 10, 2021 at 11:05:07PM +0100, Frederic Weisbecker wrote:
> On Tue, Mar 02, 2021 at 05:15:57PM -0800, Paul E. McKenney wrote:
> > The first question is of course: Did you try this with lockdep enabled?  ;-)
> 
> Yep I always do. But I may miss some configs on my testings. I usually
> test at least TREE01 on x86 and arm64.
> 
> > > @@ -1702,43 +1692,50 @@ bool rcu_is_nocb_cpu(int cpu)
> > >  	return false;
> > >  }
> > >  
> > > -/*
> > > - * Kick the GP kthread for this NOCB group.  Caller holds ->nocb_lock
> > > - * and this function releases it.
> > > - */
> > > -static bool wake_nocb_gp(struct rcu_data *rdp, bool force,
> > > -			 unsigned long flags)
> > > -	__releases(rdp->nocb_lock)
> > > +static bool __wake_nocb_gp(struct rcu_data *rdp_gp,
> > > +			   struct rcu_data *rdp,
> > > +			   bool force, unsigned long flags)
> > > +	__releases(rdp_gp->nocb_gp_lock)
> > >  {
> > >  	bool needwake = false;
> > > -	struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
> > >  
> > > -	lockdep_assert_held(&rdp->nocb_lock);
> > >  	if (!READ_ONCE(rdp_gp->nocb_gp_kthread)) {
> > > -		rcu_nocb_unlock_irqrestore(rdp, flags);
> > > +		raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
> > >  		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
> > >  				    TPS("AlreadyAwake"));
> > >  		return false;
> > >  	}
> > >  
> > > -	if (READ_ONCE(rdp->nocb_defer_wakeup) > RCU_NOCB_WAKE_NOT) {
> > > -		WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT);
> > > -		del_timer(&rdp->nocb_timer);
> > > +	if (rdp_gp->nocb_defer_wakeup > RCU_NOCB_WAKE_NOT) {
> > 
> > So there are no longer any data races involving ->nocb_defer_wakeup?
> > 
> > (Yes, I could fire up KCSAN, but my KCSAN-capable system is otherwise
> > occupied for several more hours.)
> 
> To be more specific, there is no more unlocked write to the timer (queue/cancel)
> and its nocb_defer_wakeup matching state. And there is only one (on purpose) racy
> reader of ->nocb_defer_wakeup which is the non-timer deferred wakeup.
> 
> So the writes to the timer keep their WRITE_ONCE() and only the reader in
> do_nocb_deferred_wakeup() keeps its READ_ONCE(). Other readers are protected
> by the ->nocb_gp_lock.
> 
> > > +
> > >  		// Advance callbacks if helpful and low contention.
> > >  		needwake_gp = false;
> > >  		if (!rcu_segcblist_restempty(&rdp->cblist,
> > > @@ -2178,11 +2182,18 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
> > >  	my_rdp->nocb_gp_bypass = bypass;
> > >  	my_rdp->nocb_gp_gp = needwait_gp;
> > >  	my_rdp->nocb_gp_seq = needwait_gp ? wait_gp_seq : 0;
> > > -	if (bypass && !rcu_nocb_poll) {
> > > -		// At least one child with non-empty ->nocb_bypass, so set
> > > -		// timer in order to avoid stranding its callbacks.
> > > +	if (bypass) {
> > >  		raw_spin_lock_irqsave(&my_rdp->nocb_gp_lock, flags);
> > > -		mod_timer(&my_rdp->nocb_bypass_timer, j + 2);
> > > +		// Avoid race with first bypass CB.
> > > +		if (my_rdp->nocb_defer_wakeup > RCU_NOCB_WAKE_NOT) {
> > > +			WRITE_ONCE(my_rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT);
> > > +			del_timer(&my_rdp->nocb_timer);
> > > +		}
> > 
> > Given that the timer does not get queued if rcu_nocb_poll, why not move the
> > above "if" statement under the one following?
> 
> It's done later in the set.
> 
> > 
> > > +		if (!rcu_nocb_poll) {
> > > +			// At least one child with non-empty ->nocb_bypass, so set
> > > +			// timer in order to avoid stranding its callbacks.
> > > +			mod_timer(&my_rdp->nocb_bypass_timer, j + 2);
> > > +		}
> > >  		raw_spin_unlock_irqrestore(&my_rdp->nocb_gp_lock, flags);
> > >  	}
> > >  	if (rcu_nocb_poll) {
> > > @@ -2385,7 +2399,10 @@ static void do_nocb_deferred_wakeup_timer(struct timer_list *t)
> > >   */
> > >  static bool do_nocb_deferred_wakeup(struct rcu_data *rdp)
> > >  {
> > > -	if (rcu_nocb_need_deferred_wakeup(rdp))
> > > +	if (!rdp->nocb_gp_rdp)
> > > +		return false;
> > 
> > This check was not necessary previously because each CPU used its own rdp,
> > correct?
> 
> Exactly!
> 
> > The theory is that this early return is taken only during boot,
> > and that the spawning of the kthreads will act as an implicit wakeup?
> 
> You guessed right! That probably deserve a comment.

OK, I have queued these for for further review and testing.  Also to
look at the overall effect.  Thank you very much!

							Thanx, Paul

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ