Message-ID: <20210720163723.GH4397@paulmck-ThinkPad-P17-Gen-1>
Date: Tue, 20 Jul 2021 09:37:23 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Marco Elver <elver@...gle.com>
Cc: syzbot <syzbot+e08a83a1940ec3846cd5@...kaller.appspotmail.com>,
rcu@...r.kernel.org, linux-kernel@...r.kernel.org,
syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] KCSAN: data-race in call_rcu / rcu_gp_kthread
On Tue, Jul 20, 2021 at 04:10:10PM +0200, Marco Elver wrote:
> On Tue, 20 Jul 2021 at 15:18, Paul E. McKenney <paulmck@...nel.org> wrote:
> [...]
> > Good catch! And yes, this would be hard to reproduce.
> >
> > How about as shown below?
>
> Acked-by: Marco Elver <elver@...gle.com>

I will apply on the next rebase, thank you!

> I was merely a little surprised syzbot was able to exercise RCU in a
> way that resulted in a data race your torture runs hadn't found yet
> (or perhaps it did and missed?).

My KCSAN runs are necessarily quite short because I do a normal, KASAN,
and KCSAN variant of each scenario of each torture test, with the
constraint that it all run overnight.

So there are probably more to find. ;-)

							Thanx, Paul

> Thanks,
> -- Marco
>
> > Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > commit 43e0f01f3b6f510dbe31d02a8f4c909c45deff04
> > Author: Paul E. McKenney <paulmck@...nel.org>
> > Date: Tue Jul 20 06:16:27 2021 -0700
> >
> >     rcu: Mark accesses to rcu_state.n_force_qs
> >
> >     This commit marks accesses to rcu_state.n_force_qs.  These data
> >     races are hard to make happen, but syzkaller was equal to the task.
> >
> >     Reported-by: syzbot+e08a83a1940ec3846cd5@...kaller.appspotmail.com
> >     Signed-off-by: Paul E. McKenney <paulmck@...nel.org>
> >
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index a7379c44a2366..245bca7cdf6ee 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -1913,7 +1913,7 @@ static void rcu_gp_fqs(bool first_time)
> >  	struct rcu_node *rnp = rcu_get_root();
> >
> >  	WRITE_ONCE(rcu_state.gp_activity, jiffies);
> > -	rcu_state.n_force_qs++;
> > +	WRITE_ONCE(rcu_state.n_force_qs, rcu_state.n_force_qs + 1);
> >  	if (first_time) {
> >  		/* Collect dyntick-idle snapshots. */
> >  		force_qs_rnp(dyntick_save_progress_counter);
> > @@ -2556,7 +2556,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
> >  	/* Reset ->qlen_last_fqs_check trigger if enough CBs have drained. */
> >  	if (count == 0 && rdp->qlen_last_fqs_check != 0) {
> >  		rdp->qlen_last_fqs_check = 0;
> > -		rdp->n_force_qs_snap = rcu_state.n_force_qs;
> > +		rdp->n_force_qs_snap = READ_ONCE(rcu_state.n_force_qs);
> >  	} else if (count < rdp->qlen_last_fqs_check - qhimark)
> >  		rdp->qlen_last_fqs_check = count;
> >
> > @@ -2904,10 +2904,10 @@ static void __call_rcu_core(struct rcu_data *rdp, struct rcu_head *head,
> >  	} else {
> >  		/* Give the grace period a kick. */
> >  		rdp->blimit = DEFAULT_MAX_RCU_BLIMIT;
> > -		if (rcu_state.n_force_qs == rdp->n_force_qs_snap &&
> > +		if (READ_ONCE(rcu_state.n_force_qs) == rdp->n_force_qs_snap &&
> >  		    rcu_segcblist_first_pend_cb(&rdp->cblist) != head)
> >  			rcu_force_quiescent_state();
> > -		rdp->n_force_qs_snap = rcu_state.n_force_qs;
> > +		rdp->n_force_qs_snap = READ_ONCE(rcu_state.n_force_qs);
> >  		rdp->qlen_last_fqs_check = rcu_segcblist_n_cbs(&rdp->cblist);
> >  	}
> >  }
> > @@ -4134,7 +4134,7 @@ int rcutree_prepare_cpu(unsigned int cpu)
> >  	/* Set up local state, ensuring consistent view of global state. */
> >  	raw_spin_lock_irqsave_rcu_node(rnp, flags);
> >  	rdp->qlen_last_fqs_check = 0;
> > -	rdp->n_force_qs_snap = rcu_state.n_force_qs;
> > +	rdp->n_force_qs_snap = READ_ONCE(rcu_state.n_force_qs);
> >  	rdp->blimit = blimit;
> >  	rdp->dynticks_nesting = 1;	/* CPU not up, no tearing. */
> >  	rcu_dynticks_eqs_online();