lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 31 Jan 2020 20:20:17 +0100
From:   Marco Elver <elver@...gle.com>
To:     Eric Dumazet <edumazet@...gle.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Will Deacon <will@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        "the arch/x86 maintainers" <x86@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: Confused about hlist_unhashed_lockless()

On Fri, 31 Jan 2020 at 19:48, Eric Dumazet <edumazet@...gle.com> wrote:
>
> On Fri, Jan 31, 2020 at 10:43 AM Peter Zijlstra <peterz@...radead.org> wrote:
> >
> > On Fri, Jan 31, 2020 at 08:48:05AM -0800, Eric Dumazet wrote:
> > >     BUG: KCSAN: data-race in del_timer / detach_if_pending
> >
> > > diff --git a/include/linux/timer.h b/include/linux/timer.h
> > > index 1e6650ed066d5d28251b0bd385fc37ef94c96532..0dc19a8c39c9e49a7cde3d34bfa4be8871cbc1c2
> > > 100644
> > > --- a/include/linux/timer.h
> > > +++ b/include/linux/timer.h
> > > @@ -164,7 +164,7 @@ static inline void destroy_timer_on_stack(struct
> > > timer_list *timer) { }
> > >   */
> > >  static inline int timer_pending(const struct timer_list * timer)
> > >  {
> > > - return timer->entry.pprev != NULL;
> > > + return !hlist_unhashed_lockless(&timer->entry);
> > >  }
> >
> > That's just completely wrong.
> >
> > Aside from any memory barrier issues that might or might not be there
> > (I'm waaaay to tired atm to tell), the above code is perfectly fine.
> >
> > In fact, this is a KCSAN compiler infrastructure 'bug'.
> >
> > Any load that is only compared to zero is immune to load-tearing issues.

We're not just fighting load-tearing, but also other compiler
optimizations. Eric pointed out the loop case:

while (!timer_pending(..)) { }

Without a READ_ONCE, the compiler can turn this into:

if (!timer_pending(..)) {
  while (true) { }
}

So, unless we can enumerate all possible cases in which a compare to 0
data race can occur, and prove they're all safe for all compiler
optimizations, could we plase not claim it's a general KCSAN compiler
infrastructure bug.

This also makes me second-guess if the osq_lock case is safe (looking
at it, I think it's still safe, but only because the surrounding ops
imply compiler barriers and the compiler then won't mess up the loop).

> > The correct thing to do here is something like:
> >
> > static inline int timer_pending(const struct timer_list *timer)
> > {
> >         /*
> >          * KCSAN compiler infrastructure is insuffiently clever to
> >          * realize that a 'load compared to zero' is immune to
> >          * load-tearing.
> >          */

I could agree with this if we knew the context in which this is used.
Unfortunately we do not know the context (unlike osq_lock), and for
all we know it could be used in a loop.

> >         return data_race(timer->entry.pprev != NULL);
> > }
>
>
> This is nice, now with have data_race()

Again, if you use 'data_race()' you'll have to prove that it'll be
safe in all cases, for all compiler versions current and present. The
READ_ONCE would take that burden from you.

Thanks,
-- Marco

> Remember these patches were sent 2 months ago, at a time we were
> trying to sort out things.
>
> data_race() was merged a few days ago.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ