Date:   Mon, 23 Sep 2019 16:54:09 +0800
From:   Boqun Feng <boqun.feng@...il.com>
To:     Dmitry Vyukov <dvyukov@...gle.com>
Cc:     Will Deacon <will@...nel.org>, Marco Elver <elver@...gle.com>,
        kasan-dev <kasan-dev@...glegroups.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Andrey Konovalov <andreyknvl@...gle.com>,
        Alexander Potapenko <glider@...gle.com>,
        "Paul E. McKenney" <paulmck@...ux.ibm.com>,
        Paul Turner <pjt@...gle.com>, Daniel Axtens <dja@...ens.net>,
        Anatol Pomazau <anatol@...gle.com>,
        Andrea Parri <parri.andrea@...il.com>,
        Alan Stern <stern@...land.harvard.edu>,
        LKMM Maintainers -- Akira Yokosawa <akiyks@...il.com>,
        Nicholas Piggin <npiggin@...il.com>,
        Daniel Lustig <dlustig@...dia.com>,
        Jade Alglave <j.alglave@....ac.uk>,
        Luc Maranget <luc.maranget@...ia.fr>
Subject: Re: Kernel Concurrency Sanitizer (KCSAN)

On Mon, Sep 23, 2019 at 10:21:38AM +0200, Dmitry Vyukov wrote:
> On Mon, Sep 23, 2019 at 6:31 AM Boqun Feng <boqun.feng@...il.com> wrote:
> >
> > On Fri, Sep 20, 2019 at 04:54:21PM +0100, Will Deacon wrote:
> > > Hi Marco,
> > >
> > > On Fri, Sep 20, 2019 at 04:18:57PM +0200, Marco Elver wrote:
> > > > We would like to share a new data-race detector for the Linux kernel:
> > > > Kernel Concurrency Sanitizer (KCSAN) --
> > > > https://github.com/google/ktsan/wiki/KCSAN  (Details:
> > > > https://github.com/google/ktsan/blob/kcsan/Documentation/dev-tools/kcsan.rst)
> > > >
> > > > To those of you who we mentioned at LPC that we're working on a
> > > > watchpoint-based KTSAN inspired by DataCollider [1], this is it (we
> > > > renamed it to KCSAN to avoid confusion with KTSAN).
> > > > [1] http://usenix.org/legacy/events/osdi10/tech/full_papers/Erickson.pdf
> > >
> > > Oh, spiffy!
> > >
> > > > In the coming weeks we're planning to:
> > > > * Set up a syzkaller instance.
> > > > * Share the dashboard so that you can see the races that are found.
> > > > * Attempt to send fixes for some races upstream (if you find that the
> > > > kcsan-with-fixes branch contains an important fix, please feel free to
> > > > point it out and we'll prioritize that).
> > >
> > > Curious: do you take into account things like alignment and/or access size
> > > when looking at READ_ONCE/WRITE_ONCE? Perhaps you could initially prune
> > > naturally aligned accesses for which __native_word() is true?
> > >
> > > > There are a few open questions:
> > > > * The big one: most of the reported races are due to unmarked
> > > > accesses; prioritization or pruning of races to focus initial efforts
> > > > to fix races might be required. Comments on how best to proceed are
> > > > welcome. We're aware that these are issues that have recently received
> > > > attention in the context of the LKMM
> > > > (https://lwn.net/Articles/793253/).
> > >
> > > This one is tricky. What I think we need to avoid is an onslaught of
> > > patches adding READ_ONCE/WRITE_ONCE without a concrete analysis of the
> > > code being modified. My worry is that Joe Developer is eager to get their
> > > first patch into the kernel, so runs this tool and starts spamming
> > > maintainers with these things to the point that they start ignoring KCSAN
> > > reports altogether because of the time they take up.
> > >
> > > I suppose one thing we could do is to require each new READ_ONCE/WRITE_ONCE
> > > to have a comment describing the racy access, a bit like we do for memory
> > > barriers. Another possibility would be to use atomic_t more widely if
> > > there is genuine concurrency involved.
> > >
> >
> > Instead of commenting READ_ONCE/WRITE_ONCE()s, how about adding
> > annotations for data fields/variables that might be accessed without
> > holding a lock? Because if all accesses to a variable are protected by
> > proper locks, we mostly don't need to worry about data races caused by
> > not using READ_ONCE/WRITE_ONCE(). Bad things happen when we write to a
> > variable using locks but read it outside a lock critical section for
> > better performance, for example, rcu_node::qsmask. So I'm thinking
> > maybe we can introduce a new annotation similar to __rcu, maybe call it
> > __lockfree ;-) as follows:
> >
> >         struct rcu_node {
> >                 ...
> >                 unsigned long __lockfree qsmask;
> >                 ...
> >         }
> >
> > , and __lockfree indicates that by design the maintainer of this data
> > structure or variable believes there will be accesses outside lock
> > critical sections. Note that not all accesses to a __lockfree field need
> > to be READ_ONCE/WRITE_ONCE(): if the developer manages to build a
> > complex but working wake/wait state machine so that the field cannot be
> > accessed concurrently, READ_ONCE()/WRITE_ONCE() is not needed.
> >
> > If we have such an annotation, I think it won't be hard to configure
> > KCSAN to only examine accesses to variables with this annotation. Also
> > this annotation could help other checkers in the future.
> >
> > If KCSAN (at least the upstream version) only checks accesses with
> > such an annotation, "spamming with KCSAN warnings/fixes" will be the
> > choice of each maintainer ;-)
> >
> > Thoughts?
> 
> But doesn't this defeat the main goal of any race detector -- finding
> concurrent accesses to complex data structures, e.g. forgotten
> spinlock around rbtree manipulation? Since rbtree is not meant for
> concurrent accesses, it won't have __lockfree annotation, and thus we
> will ignore races on it...

Maybe, but for detecting forgotten locks we already have lockdep, and
sparse can also help a little. A __lockfree annotation could be
beneficial because it lets KCSAN focus on accesses whose races only
KCSAN can detect at this time. I think this could help KCSAN find
problems more easily (and faster).
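
To make the "write under the lock, read outside it" pattern concrete,
here is a rough userspace sketch. Everything in it is an assumption for
illustration only: __lockfree does not exist anywhere and is defined as
an empty macro, READ_ONCE/WRITE_ONCE are simplified volatile-cast
stand-ins for the kernel macros, and toy_node is a made-up structure,
not the real rcu_node.

```c
#include <pthread.h>

/* Hypothetical annotation: expands to nothing in this sketch. A real
 * version would need to be something a checker can see. */
#define __lockfree

/* Simplified userspace stand-ins for the kernel's READ_ONCE/WRITE_ONCE
 * (relies on the GCC/Clang __typeof__ extension). */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Toy structure for illustration, not the real rcu_node. */
struct toy_node {
	pthread_mutex_t lock;
	unsigned long __lockfree qsmask; /* written under lock, read locklessly */
};

/* Writer: every update happens with the lock held. The store is marked
 * because lockless readers may run concurrently with it. */
static void toy_clear_bit(struct toy_node *n, unsigned long bit)
{
	pthread_mutex_lock(&n->lock);
	WRITE_ONCE(n->qsmask, n->qsmask & ~bit);
	pthread_mutex_unlock(&n->lock);
}

/* Reader: lockless fast path, so the load is marked. */
static int toy_all_done(struct toy_node *n)
{
	return READ_ONCE(n->qsmask) == 0;
}
```

With a checker restricted to __lockfree fields, an unmarked access to
qsmask outside the lock is what it would presumably be asked to report.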

Out of curiosity, has KCSAN ever found a problem involving a forgotten
lock? I didn't see any in the -with-fixes branch (that's understandable:
given how serious such problems are, fixes for them may already have
been submitted upstream as soon as KCSAN found them).

Regards,
Boqun
