linux-kernel - RE: [PATCH v3] rcu: Remove impossible wakeup rcu GP kthread action from rcu_report_qs

Message-ID: <PH0PR11MB58803DFEDD61EE66AF49074EDACA9@PH0PR11MB5880.namprd11.prod.outlook.com>
Date:   Sat, 21 Jan 2023 00:38:35 +0000
From:   "Zhang, Qiang1" <qiang1.zhang@...el.com>
To:     Joel Fernandes <joel@...lfernandes.org>
CC:     "Paul E. McKenney" <paulmck@...nel.org>,
        "frederic@...nel.org" <frederic@...nel.org>,
        "quic_neeraju@...cinc.com" <quic_neeraju@...cinc.com>,
        "rcu@...r.kernel.org" <rcu@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH v3] rcu: Remove impossible wakeup rcu GP kthread action
 from rcu_report_qs_rdp()


> On Jan 20, 2023, at 3:19 AM, Zhang, Qiang1 <qiang1.zhang@...el.com> wrote:
> 
> 
>> 
>> 
>>>> On Wed, Jan 18, 2023 at 03:30:14PM +0800, Zqiang wrote:
>>>>> When rcu_report_qs_rdp() is invoked and the current CPU's rcu_data
>>>>> structure's ->grpmask has not yet been cleared from the corresponding
>>>>> rcu_node structure's ->qsmask, the function clears it and reports a
>>>>> quiescent state. At that point the current grace period has not ended
>>>>> and is still ongoing, so rcu_gp_in_progress() returns true. For a
>>>>> non-offloaded rdp, it is therefore impossible for rcu_accelerate_cbs()
>>>>> to return true.
>>>>> 
>>>>> This commit therefore removes the impossible rcu_gp_kthread_wake() call.
>>>>> 
>>>>> Signed-off-by: Zqiang <qiang1.zhang@...el.com>
>>>>> Reviewed-by: Frederic Weisbecker <frederic@...nel.org>
>>> 
>>> Queued (wordsmithed as shown below, as always, please check) for further
>>> testing and review, thank you both!
>>> 
>>>                                                      Thanx, Paul
>>> 
>>> ------------------------------------------------------------------------
>>> 
>>> commit fbe3e300ec8b3edd2b8f84dab4dc98947cf71eb8
>>> Author: Zqiang <qiang1.zhang@...el.com>
>>> Date:   Wed Jan 18 15:30:14 2023 +0800
>>> 
>>>    rcu: Remove never-set needwake assignment from rcu_report_qs_rdp()
>>> 
>>>    The rcu_accelerate_cbs() function is invoked by rcu_report_qs_rdp()
>>>    only if there is a grace period in progress that is still blocked
>>>    by at least one CPU on this rcu_node structure.  This means that
>>>    rcu_accelerate_cbs() should never return the value true, and thus that
>>>    this function should never set the needwake variable and in turn never
>>>    invoke rcu_gp_kthread_wake().
>>> 
>>>    This commit therefore removes the needwake variable and the invocation
>>>    of rcu_gp_kthread_wake() in favor of a WARN_ON_ONCE() on the call to
>>>    rcu_accelerate_cbs().  The purpose of this new WARN_ON_ONCE() is to
>>>    detect situations where the system's opinion differs from ours.
>>> 
>>>    Signed-off-by: Zqiang <qiang1.zhang@...el.com>
>>>    Reviewed-by: Frederic Weisbecker <frederic@...nel.org>
>>>    Signed-off-by: Paul E. McKenney <paulmck@...nel.org>
>>> 
>>> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
>>> index b2c2045294780..7a3085ad0a7df 100644
>>> --- a/kernel/rcu/tree.c
>>> +++ b/kernel/rcu/tree.c
>>> @@ -1956,7 +1956,6 @@ rcu_report_qs_rdp(struct rcu_data *rdp)
>>> {
>>>      unsigned long flags;
>>>      unsigned long mask;
>>> -     bool needwake = false;
>>>      bool needacc = false;
>>>      struct rcu_node *rnp;
>>> 
>>> @@ -1988,7 +1987,12 @@ rcu_report_qs_rdp(struct rcu_data *rdp)
>>>               * NOCB kthreads have their own way to deal with that...
>>>               */
>>>              if (!rcu_rdp_is_offloaded(rdp)) {
>>> -                     needwake = rcu_accelerate_cbs(rnp, rdp);
>>> +                     /*
>>> +                      * The current GP has not yet ended, so it
>>> +                      * should not be possible for rcu_accelerate_cbs()
>>> +                      * to return true.  So complain, but don't awaken.
>>> +                      */
>>> +                     WARN_ON_ONCE(rcu_accelerate_cbs(rnp, rdp));
>>>              } else if (!rcu_segcblist_completely_offloaded(&rdp->cblist)) {
>>>                      /*
>>>                       * ...but NOCB kthreads may miss or delay callbacks acceleration
>>> @@ -2000,8 +2004,6 @@ rcu_report_qs_rdp(struct rcu_data *rdp)
>>>              rcu_disable_urgency_upon_qs(rdp);
>>>              rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags);
>>>              /* ^^^ Released rnp->lock */
>>> -             if (needwake)
>>> -                     rcu_gp_kthread_wake();
>>> 
>>> AFAICS, there is almost no compiler benefit of doing this, and zero runtime
>>> benefit of doing this. The WARN_ON_ONCE() also involves a runtime condition
>>> check of the return value of rcu_accelerate_cbs(), so you still have a
>>> branch. Yes, maybe slightly smaller code without the wake call, but I'm not
>>> sure that is worth it.
>>> 
>>> And if the system's opinion differs, it's a bug anyway, so this only adds risk.
>>> 
>>> 
>>> 
>>>              if (needacc) {
>>>                      rcu_nocb_lock_irqsave(rdp, flags);
>>> 
>>> And when needacc = true, rcu_accelerate_cbs_unlocked() tries to do a wake up
>>> anyway, so it is consistent with nocb vs !nocb.
>> 
>> For !nocb, we invoke rcu_accelerate_cbs() before reporting the QS, so this GP cannot have ended,
>> and rcu_accelerate_cbs() also does not set RCU_GP_FLAG_INIT to start a new GP.
>> For nocb, however, when needacc = true we invoke rcu_accelerate_cbs_unlocked() after the current CPU
>> has reported its QS. If all CPUs have reported a QS, rcu_report_qs_rnp() wakes up the GP kthread to end
>> this GP. After that, rcu_accelerate_cbs_unlocked() may try to wake up the GP kthread if this GP has
>> already ended by then. So nocb vs !nocb is likely to be inconsistent.
>> 
>> 
>> That is a fair point. But after the GP ends, rcu_check_quiescent_state()
>> -> note_gp_changes() will do an accel + GP thread wakeup at that
>> point anyway, once it notices that a GP has come to an end. That
>> should happen for both the nocb and !nocb cases, right?
> 
> For a nocb rdp, we invoke neither rcu_accelerate_cbs() nor rcu_advance_cbs() in
> note_gp_changes(), so note_gp_changes() does not wake up the GP kthread either.
>
>Yes correct, ok but…
> 
>> 
>> I am wondering if rcu_report_qs_rdp() needs to be rethought to make
>> both cases consistent.
>> 
>> Why does the nocb case need an accel + GP thread wakeup in the
>> rcu_report_qs_rdp() function, but the !nocb case does not?
> 
> For nocb, the accel + GP kthread wakeup only happens in the middle of a (de-)offloading process;
> this is an intermediate state.
>
>Sure, I know what the code currently does, I am asking why and it feels wrong.
>
>I suggest you slightly change your approach: rather than assuming the code should be bona fide
>correct and then fixing it (which is ok once in a while), ask higher-level questions
>about why things are the way they are in the first place (that is just my suggestion and I am not in
>a place to provide advice, far from it, but I am just telling you my approach: I care more about
>the code than increasing my patch count :P).
>

Thanks Joel, this is a useful suggestion for me.

>
>If you are in an intermediate state, part way to a !nocb state — 
>you may have missed a nocb-related accel and wake, correct? Why does that matter? 
>Once we transition to a !nocb state, we do not do a post-qs-report accel+wake anyway

Should that be: in the transition to a !nocb state, we do a post-qs-report accel+wake?

>as we clearly know from the discussion. So why do we need to do it if we missed it for
>the intermediate stage? So, I am not fully sure yet what that needacc is doing and why it is needed.

For de-offloading: while de-offloading is in progress (rcu_segcblist_completely_offloaded() returns false),
we are not doing bypass even though this rdp is still in the offloaded state (rcu_rdp_is_offloaded(rdp) returns true).
At this time the rcuog kthread probably won't do the accel+wakeup, so we do the accel+wakeup in rcu_report_qs_rdp().
As for why that matters: in the !nocb state we have always tried to accel+wakeup as
much as possible (compared to nocb), so that RCU callbacks are executed as soon as possible.

This is just my personal opinion, please correct me if I am wrong.
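
To make the three cases above concrete, here is a minimal user-space sketch in C.
It is not kernel code: struct rdp_model, enum qs_accel, qs_accel_path(),
qs_accel_may_wake() and qs_accel_self_check() are hypothetical names made up for
this mail, and the two bool fields only stand in for rcu_rdp_is_offloaded() and
rcu_segcblist_completely_offloaded(). It assumes that needacc is set exactly when
the rdp is offloaded but not completely offloaded, which is how I read the hunk
quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the two rdp predicates used in rcu_report_qs_rdp(). */
struct rdp_model {
	bool offloaded;            /* stands in for rcu_rdp_is_offloaded(rdp) */
	bool completely_offloaded; /* stands in for rcu_segcblist_completely_offloaded() */
};

/* Where callback acceleration happens relative to the QS report (assumed). */
enum qs_accel {
	ACCEL_BEFORE_REPORT, /* !nocb: accelerate before the QS is reported */
	ACCEL_AFTER_REPORT,  /* (de-)offloading: needacc, accelerate after the report */
	ACCEL_LEFT_TO_NOCB,  /* fully offloaded: left to the NOCB kthreads */
};

/* Pick the path the same way the if / else-if chain in the diff does. */
enum qs_accel qs_accel_path(const struct rdp_model *rdp)
{
	if (!rdp->offloaded)
		return ACCEL_BEFORE_REPORT;
	if (!rdp->completely_offloaded)
		return ACCEL_AFTER_REPORT;
	return ACCEL_LEFT_TO_NOCB;
}

/*
 * Only the after-report path can run once the GP may already have ended,
 * so only there can acceleration ask for a GP kthread wakeup.
 */
bool qs_accel_may_wake(enum qs_accel path)
{
	return path == ACCEL_AFTER_REPORT;
}

/* Small self-check covering the three states discussed in this thread. */
void qs_accel_self_check(void)
{
	struct rdp_model rdp = { .offloaded = false, .completely_offloaded = false };

	assert(!qs_accel_may_wake(qs_accel_path(&rdp))); /* !nocb */
	rdp.offloaded = true;
	assert(qs_accel_may_wake(qs_accel_path(&rdp)));  /* mid (de-)offloading */
	rdp.completely_offloaded = true;
	assert(!qs_accel_may_wake(qs_accel_path(&rdp))); /* fully offloaded */
}
```

Calling qs_accel_self_check() from a test main() checks all three states. The
sketch only encodes the ordering argument (accelerate before vs. after the QS
report); it says nothing about locking or about what the rcuog kthread does.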

Thanks
Zqiang


>
>Do not get me wrong, stellar work here. But I suggest challenging the assumptions and the design, not always just the code that was already written :), apologies for any misplaced or noisy advice.
>
>Thanks!
>
> - Joel


>    
> Thanks
> Zqiang
> 
>> 
>> (I am out of office till Monday but will intermittently (maybe) check
>> in, RCU is one of those things that daydreaming tends to lend itself
>> to...)
>> 
>> - Joel