Message-ID: <20201209021449.GK3025@boqun-archlinux>
Date:   Wed, 9 Dec 2020 10:14:49 +0800
From:   Boqun Feng <boqun.feng@...il.com>
To:     Frederic Weisbecker <frederic@...nel.org>
Cc:     "Paul E. McKenney" <paulmck@...nel.org>, rcu@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: One potential issue with concurrent execution of RCU callbacks...

Hi Frederic,

On Tue, Dec 08, 2020 at 11:04:38PM +0100, Frederic Weisbecker wrote:
> On Tue, Dec 08, 2020 at 10:24:09AM -0800, Paul E. McKenney wrote:
> > > It reduces the code scope running with BH disabled.
> > > Also narrowing down helps to understand what it actually protects.
> > 
> > I thought that you would call out unnecessarily delaying other softirq
> > handlers.  ;-)
> > 
> > But if such delays are a problem (and they might well be), then to
> > avoid them on non-rcu_nocb CPUs would instead/also require adding
> > checks for other pending softirqs to the existing early-exit checks
> > involving time, need_resched, and idle.  At which point, entering and
> > exiting BH-disabled again doesn't help, other than your point about the
> > difference in BH-disabled scopes on rcu_nocb and non-rcu_nocb CPUs.
> 
> Wise observation!
> 
> > 
> > Would it make sense to exit rcu_do_batch() if more than some amount
> > of time had elapsed and there was some non-RCU softirq pending?
> > 
> > My guess is that the current tlimit checks in rcu_do_batch() make this
> > unnecessary.
> 
> Right and nobody has complained about it so far.
> 
> But I should add a comment explaining the reason for the BH-disabled
> section in my series.
> 

Some background for the original question: I'm revisiting the wait
context checking feature of lockdep (which can detect bugs like
acquiring a spinlock_t lock inside a raw_spinlock_t critical section).
I've posted my first version:

	https://lore.kernel.org/lkml/20201208103112.2838119-1-boqun.feng@gmail.com/

and I will surely copy you on the next version ;-)
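
For illustration, here is a minimal made-up instance of the bug class
this checking catches (the lock names are hypothetical):

	static DEFINE_RAW_SPINLOCK(outer_rlock);
	static DEFINE_SPINLOCK(inner_lock);

	raw_spin_lock(&outer_rlock);
	spin_lock(&inner_lock);		/* spinlock_t is a sleeping lock on
					 * PREEMPT_RT, so taking it inside a
					 * raw_spinlock_t critical section
					 * violates the wait context rules */
	spin_unlock(&inner_lock);
	raw_spin_unlock(&outer_rlock);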

The reason I asked about the RCU callback context requirement is that we
have a virtual lock (rcu_callback_map) that marks an RCU callback
context, so if RCU callback contexts place special restrictions on the
locking usage inside them, we can use the wait context checking to
enforce those restrictions (like what I did in patch #3 of the above
series).
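
For reference, the marking is roughly the following (simplified from
the callback-invocation loop in rcu_do_batch() in kernel/rcu/tree.c):

	rcu_lock_acquire(&rcu_callback_map);	/* "acquire" the virtual lock */
	f = rhp->func;
	WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
	f(rhp);					/* invoke the callback */
	rcu_lock_release(&rcu_callback_map);	/* "release" the virtual lock */

so every lock acquired inside the callback is seen by lockdep as nested
inside rcu_callback_map, and giving rcu_callback_map an appropriate wait
context makes sleepable locks inside it a reportable error.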

My current summary is that since in certain configs (use_softirq is true
and nocb is disabled) RCU callbacks are executed in softirq context, the
minimum requirement for any RCU callback is that it obey the rules of
softirq contexts. And yes, I'm aware that in some configs RCU callbacks
are not executed in softirq context (sometimes even BH is not disabled),
but we need to make all the callbacks work in the "worst" (or strictest)
case, i.e. callbacks executing in softirq contexts. Currently, the
effect of using wait context for rcu_callback_map in my patchset is that
lockdep will complain if an RCU callback uses a mutex or other sleepable
locks, but using spinlock_t (even on PREEMPT_RT) won't cause lockdep to
complain. Am I getting this right?
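
To make that concrete, here is a made-up callback (my_mutex and my_cb()
are hypothetical) that I'd expect the check to flag:

	static DEFINE_MUTEX(my_mutex);

	/* Queued via call_rcu(), so it may run in softirq context. */
	static void my_cb(struct rcu_head *rhp)
	{
		mutex_lock(&my_mutex);	/* sleeping lock in an RCU callback:
					 * lockdep complains */
		mutex_unlock(&my_mutex);
	}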

Regards,
Boqun

> Thanks.
> 
> > 
> > Thoughts?
> > 
> > 							Thanx, Paul
