Message-ID: <20191009022017.GA664@boqun-laptop.fareast.corp.microsoft.com>
Date:   Wed, 9 Oct 2019 10:20:17 +0800
From:   Boqun Feng <boqun.feng@...il.com>
To:     Joel Fernandes <joel@...lfernandes.org>
Cc:     Marco Elver <elver@...gle.com>,
        LKML <linux-kernel@...r.kernel.org>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Josh Triplett <josh@...htriplett.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Lai Jiangshan <jiangshanlai@...il.com>, rcu@...r.kernel.org
Subject: Re: [PATCH] rcu: Avoid to modify mask_ofl_ipi in
 sync_rcu_exp_select_node_cpus()

On Tue, Oct 08, 2019 at 01:01:21PM -0400, Joel Fernandes wrote:
> On Tue, Oct 08, 2019 at 06:35:45PM +0200, Marco Elver wrote:
> > On Tue, 8 Oct 2019 at 18:30, Joel Fernandes <joel@...lfernandes.org> wrote:
> > >
> > > On Tue, Oct 08, 2019 at 01:01:40PM +0800, Boqun Feng wrote:
> > > > "mask_ofl_ipi" is used to iterate over the CPUs to which IPIs need to
> > > > be sent. However, in the IPI sending loop, "mask_ofl_ipi" along with
> > > > another variable, "mask_ofl_test", might also get modified to record which
> > > > CPUs' quiescent states can be reported by sync_rcu_exp_select_node_cpus().
> > > > Two variables seem redundant for such a purpose, so this patch cleans
> > > > things up a little by solely using "mask_ofl_test" for recording and
> > > > "mask_ofl_ipi" for iteration. This improves the readability of the
> > > > IPI sending loop in sync_rcu_exp_select_node_cpus().
> > > >
> > > > Signed-off-by: Boqun Feng <boqun.feng@...il.com>
> > > > ---
> > >
> > > Reviewed-by: Joel Fernandes (Google) <joel@...lfernandes.org>
> > >
> > > thanks,
> > >
> > >  - Joel
> > 
> > Acked-by: Marco Elver <elver@...gle.com>
> > 

Thank you both!

> > If this is the official patch for the fix to the KCSAN reported
> > data-race, it'd be great to include the tag:
> > Reported-by: syzbot+134336b86f728d6e55a0@...kaller.appspotmail.com
> > so the bot knows this was fixed.
> 
> It is just an optimization that got triggered by debugging of the
> reported issue, but it does not (and should not) fix the issue.
> 

Right.

> Boqun, are you going to be posting another patch which just uses mask_ofl_ipi
> in the for_each(..) loop? (without using _snap) as Paul suggested?
> 

IIUC, Paul already has this fix, along with other ->expmask fixes, queued
in his dev branch:

	https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git/commit/?h=dev&id=4e4fefe0630dcf7415d62e6d9171c8f209444376

That commit also carries the proper "Reported-by" tag to give syzbot credit.

Regards,
Boqun

> Paul mentioned other places where rnp->expmask is locklessly accessed so I
> think that may be fixed separately (such as the stall-warning code). Paul,
> were you planning on fixing all such accesses together (other than this code)
> or should I look into it more? I guess for the stall case, KCSAN would have
> to trigger stalls to see those issues.
> 
> thanks,
> 
>  - Joel
> 
> > 
> > Thanks!
> > -- Marco
> > 
> > > >  kernel/rcu/tree_exp.h | 13 ++++++-------
> > > >  1 file changed, 6 insertions(+), 7 deletions(-)
> > > >
> > > > diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
> > > > index 69c5aa64fcfd..212470018752 100644
> > > > --- a/kernel/rcu/tree_exp.h
> > > > +++ b/kernel/rcu/tree_exp.h
> > > > @@ -387,10 +387,10 @@ static void sync_rcu_exp_select_node_cpus(struct work_struct *wp)
> > > >               }
> > > >               ret = smp_call_function_single(cpu, rcu_exp_handler, NULL, 0);
> > > >               put_cpu();
> > > > -             if (!ret) {
> > > > -                     mask_ofl_ipi &= ~mask;
> > > > > +             /* The CPU responded to the IPI, and will report QS itself */
> > > > +             if (!ret)
> > > >                       continue;
> > > > -             }
> > > > +
> > > >               /* Failed, raced with CPU hotplug operation. */
> > > >               raw_spin_lock_irqsave_rcu_node(rnp, flags);
> > > >               if ((rnp->qsmaskinitnext & mask) &&
> > > > @@ -401,13 +401,12 @@ static void sync_rcu_exp_select_node_cpus(struct work_struct *wp)
> > > >                       schedule_timeout_uninterruptible(1);
> > > >                       goto retry_ipi;
> > > >               }
> > > > -             /* CPU really is offline, so we can ignore it. */
> > > > -             if (!(rnp->expmask & mask))
> > > > -                     mask_ofl_ipi &= ~mask;
> > > > +             /* CPU really is offline, and we need its QS to pass GP. */
> > > > +             if (rnp->expmask & mask)
> > > > +                     mask_ofl_test |= mask;
> > > >               raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
> > > >       }
> > > >       /* Report quiescent states for those that went offline. */
> > > > -     mask_ofl_test |= mask_ofl_ipi;
> > > >       if (mask_ofl_test)
> > > >               rcu_report_exp_cpu_mult(rnp, mask_ofl_test, false);
> > > >  }
> > > > --
> > > > 2.23.0
> > > >
