linux-kernel - Re: [PATCH 03/11] rcu/nocb: Invoke rcu_core() at the start of deoffloading


Message-ID: <20211004124141.GA272717@lothringen>
Date:   Mon, 4 Oct 2021 14:41:41 +0200
From:   Frederic Weisbecker <frederic@...nel.org>
To:     Valentin Schneider <valentin.schneider@....com>
Cc:     "Paul E . McKenney" <paulmck@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>,
        Uladzislau Rezki <urezki@...il.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Boqun Feng <boqun.feng@...il.com>,
        Neeraj Upadhyay <neeraju@...eaurora.org>,
        Josh Triplett <josh@...htriplett.org>,
        Joel Fernandes <joel@...lfernandes.org>, rcu@...r.kernel.org
Subject: Re: [PATCH 03/11] rcu/nocb: Invoke rcu_core() at the start of
 deoffloading

On Fri, Oct 01, 2021 at 06:50:04PM +0100, Valentin Schneider wrote:
> On 30/09/21 00:10, Frederic Weisbecker wrote:
> > On PREEMPT_RT, if rcu_core() is preempted by the de-offloading process,
> > some work, such as callbacks acceleration and invocation, may be left
> > unattended due to the volatile checks on the offloaded state.
> >
> > In the worst case this work is postponed until the next rcu_pending()
> > check that can take a jiffy to reach, which can be a problem in case
> > of callbacks flooding.
> >
> > Solve that with invoking rcu_core() early in the de-offloading process.
> > This way any work dismissed by an ongoing rcu_core() call fooled by
> > a preempting deoffloading process will be caught up by a nearby future
> > recall to rcu_core(), this time fully aware of the de-offloading state.
> >
> > Signed-off-by: Frederic Weisbecker <frederic@...nel.org>
> > Cc: Valentin Schneider <valentin.schneider@....com>
> > Cc: Peter Zijlstra <peterz@...radead.org>
> > Cc: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
> > Cc: Josh Triplett <josh@...htriplett.org>
> > Cc: Joel Fernandes <joel@...lfernandes.org>
> > Cc: Boqun Feng <boqun.feng@...il.com>
> > Cc: Neeraj Upadhyay <neeraju@...eaurora.org>
> > Cc: Uladzislau Rezki <urezki@...il.com>
> > Cc: Thomas Gleixner <tglx@...utronix.de>
> 
> One comment/question below.
> 
> > @@ -990,6 +990,15 @@ static long rcu_nocb_rdp_deoffload(void *arg)
> >        * will refuse to put anything into the bypass.
> >        */
> >       WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies));
> > +	/*
> > +	 * Start with invoking rcu_core() early. This way if the current thread
> > +	 * happens to preempt an ongoing call to rcu_core() in the middle,
> > +	 * leaving some work dismissed because rcu_core() still thinks the rdp is
> > +	 * completely offloaded, we are guaranteed a nearby future instance of
> > +	 * rcu_core() to catch up.
> > +	 */
> > +	rcu_segcblist_set_flags(cblist, SEGCBLIST_RCU_CORE);
> > +	invoke_rcu_core();
> 
> I think your approach is a bit neater, but would there have been any issue
> with keeping the setting of SEGCBLIST_RCU_CORE within
> rcu_segcblist_offload() and bundling it with an invoke_rcu_core()?

Probably not in practice.

But in theory, it may be more comfortable to read the following in order:

1) Set SEGCBLIST_RCU_CORE so subsequent invocations of rcu_core() handle
callbacks

2) Invoke rcu_core()

3) Only once the above is done can we clear SEGCBLIST_OFFLOADED, which
will stop the nocb kthreads.

If we did 3) first and only then 1) and 2), there would be a risk that callbacks
get completely ignored in the middle.
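
To illustrate the ordering argument, here is a toy userspace model, not kernel
code. It assumes callbacks are "covered" whenever at least one of the two flags
is set. The TOY_* names and bit values are made up for the example and only
stand in for SEGCBLIST_RCU_CORE and SEGCBLIST_OFFLOADED.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values; the real flags are defined by the kernel. */
#define TOY_RCU_CORE  0x1u  /* rcu_core() handles callbacks */
#define TOY_OFFLOADED 0x2u  /* nocb kthreads handle callbacks */

/* Assumption of this model: someone handles callbacks if either bit is set. */
static bool toy_covered(unsigned int flags)
{
	return (flags & (TOY_RCU_CORE | TOY_OFFLOADED)) != 0;
}

/* Order 1), then 3): returns true if every intermediate state is covered. */
static bool toy_core_first(void)
{
	unsigned int flags = TOY_OFFLOADED;
	bool ok;

	assert(toy_covered(flags));     /* the starting state is offloaded */
	flags |= TOY_RCU_CORE;          /* 1) rcu_core() may handle callbacks */
	ok = toy_covered(flags);
	flags &= ~TOY_OFFLOADED;        /* 3) nocb kthreads stop */
	return ok && toy_covered(flags);
}

/* Order 3), then 1): there is a window where neither bit is set. */
static bool toy_offloaded_first(void)
{
	unsigned int flags = TOY_OFFLOADED;
	bool ok;

	assert(toy_covered(flags));
	flags &= ~TOY_OFFLOADED;        /* 3) first: flags == 0, nobody in charge */
	ok = toy_covered(flags);
	flags |= TOY_RCU_CORE;          /* 1) comes too late for that window */
	return ok && toy_covered(flags);
}
```

In this model toy_core_first() never observes an uncovered state, while
toy_offloaded_first() does, which is the window described above.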

That said you have a point in that we could do:

1) Set SEGCBLIST_RCU_CORE and clear SEGCBLIST_OFFLOADED at the _very_ same time
(arrange that with a WRITE_ONCE() I guess).

2) Invoke rcu_core()

But well... arranging for rcu_core() to take over before we even consider
starting the de-offloading process provides some unexplainable relief to the
soul. Code design sometimes relies more on faith than logic :)

Thanks.