Date:   Sun, 26 Jun 2022 14:37:22 +0000
From:   Joel Fernandes <joel@...lfernandes.org>
To:     "Paul E. McKenney" <paulmck@...nel.org>
Cc:     rcu@...r.kernel.org, linux-kernel@...r.kernel.org,
        rushikesh.s.kadam@...el.com, urezki@...il.com,
        neeraj.iitr10@...il.com, frederic@...nel.org, rostedt@...dmis.org,
        vineeth@...byteword.org
Subject: Re: [PATCH v2 5/8] rcu/nocb: Wake up gp thread when flushing

On Sun, Jun 26, 2022 at 06:52:40AM -0700, Paul E. McKenney wrote:
> On Sun, Jun 26, 2022 at 01:45:32PM +0000, Joel Fernandes wrote:
> > On Sat, Jun 25, 2022 at 09:06:22PM -0700, Paul E. McKenney wrote:
> > > On Wed, Jun 22, 2022 at 10:50:59PM +0000, Joel Fernandes (Google) wrote:
> > > > We notice that rcu_barrier() can take a really long time. It appears
> > > > that this can happen when all CBs are lazy and the timer has not fired
> > > > yet. So after flushing, nothing wakes up the GP thread. This patch forces
> > > > the GP thread to wake when bypass flushing happens, which fixes the
> > > > rcu_barrier() delays with lazy CBs.
> > > 
> > > I am wondering if there is a bug in non-rcu_barrier() lazy callback
> > > processing hiding here as well?
> > 
> > I don't think so because in both nocb_try_bypass and nocb_gp_wait, we are not
> > going to an indefinite sleep after the flush. However, with rcu_barrier(),
> > there is nothing to keep the RCU GP thread awake. That's my theory at least.
> > In practice, I have not been able to reproduce this issue with
> > non-rcu_barrier().
> > 
> > With rcu_barrier() I happened to hit it thanks to the rcuscale changes I did.
> > That's an interesting story! When I applied call_rcu_lazy() to the file table
> > code, it turned out that on boot, the initramfs unpacking code continuously
> > triggers call_rcu_lazy(). This apparently happens in a different thread than
> > the one that rcuscale is running in. In rcuscale, I did rcu_barrier() at init
> > time and, to my surprise, this stalled for a long time. This patch fixed it.
> 
> Cool!
> 
> Then should this wake_nocb_gp() instead go into the rcu_barrier()
> code path?  As shown below, wouldn't we be doing some spurious wakeups?

You are right. In my testing, I don't see any issue with the extra wakeup,
which is going to happen anyway. My thought was to do it unconditionally in
case a future bypass flush from some other path forgets to call the wakeup.

I'll refine it to be rcu-barrier-only then.
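
To illustrate the idea, here is a minimal single-threaded toy model in C. It
is only a sketch and not the kernel code: every toy_* name is hypothetical,
and the "GP thread" is just a flag. It shows a flush that leaves the GP thread
asleep (the stall described above) versus a barrier path that flushes and then
wakes, while other flush callers stay free of extra wakeups.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these names are kernel APIs. */
struct toy_nocb {
	int bypass_len;  /* lazy callbacks parked in the bypass list */
	int cblist_len;  /* callbacks visible to the GP thread */
	bool gp_awake;   /* whether the GP thread has been woken */
};

/* Move lazy callbacks to the main list. Does not wake anyone. */
static void toy_flush_bypass(struct toy_nocb *n)
{
	n->cblist_len += n->bypass_len;
	n->bypass_len = 0;
}

/* Wake the (simulated) GP thread. */
static void toy_wake_gp(struct toy_nocb *n)
{
	n->gp_awake = true;
}

/* One GP-thread step: processes callbacks only if it was woken. */
static int toy_gp_run(struct toy_nocb *n)
{
	int done;

	if (!n->gp_awake)
		return 0;        /* still asleep: callbacks keep waiting */
	done = n->cblist_len;
	n->cblist_len = 0;
	n->gp_awake = false;
	return done;
}

/* Barrier path: flush, and optionally wake the GP thread right here. */
static int toy_demo(bool wake_in_barrier)
{
	struct toy_nocb n = { .bypass_len = 3, .cblist_len = 0, .gp_awake = false };

	toy_flush_bypass(&n);
	if (wake_in_barrier)
		toy_wake_gp(&n);
	return toy_gp_run(&n);   /* callbacks processed after the flush */
}
```

In this model, toy_demo(false) processes nothing because nobody wakes the GP
thread after the flush, while toy_demo(true) processes all three callbacks.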

thanks!

 - Joel
