linux-kernel - Re: [PATCH -next] lib: disable KCSAN for XArray

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200305213946.GL2935@paulmck-ThinkPad-P72>
Date:   Thu, 5 Mar 2020 13:39:46 -0800
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     Matthew Wilcox <willy@...radead.org>
Cc:     Qian Cai <cai@....pw>, elver@...gle.com,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH -next] lib: disable KCSAN for XArray

On Thu, Mar 05, 2020 at 07:18:31AM -0800, Matthew Wilcox wrote:
> On Wed, Mar 04, 2020 at 06:10:21AM -0800, Paul E. McKenney wrote:
> > On Tue, Mar 03, 2020 at 08:33:56PM -0800, Matthew Wilcox wrote:
> > > On Tue, Mar 03, 2020 at 08:05:15PM -0800, Paul E. McKenney wrote:
> > > > On Tue, Mar 03, 2020 at 07:33:29PM -0800, Matthew Wilcox wrote:
> > > > > On Tue, Mar 03, 2020 at 10:15:51PM -0500, Qian Cai wrote:
> > > > > > Functions like xas_find_marked(), xas_set_mark(), and xas_clear_mark()
> > > > > > could happen concurrently result in data races, but those operate only
> > > > > > on a single bit that are pretty much harmless. For example,
> > > > > 
> > > > > Those aren't data races.  The writes are protected by a spinlock and the
> > > > > reads by the RCU read lock.  If the tool can't handle RCU protection,
> > > > > it's not going to be much use.
> > > > 
> > > > Would KCSAN's ASSERT_EXCLUSIVE_BITS() help here?
> > > 
> > > I'm quite lost in the sea of new macros that have been added to help
> > > with KCSAN.  It doesn't help that they're in -next somewhere that I
> > > can't find, and not in mainline yet.  Is there documentation somewhere?
> > 
> > Yes, there is documentation.  In -next somewhere.  :-/
> 
> Looking at the documentation link that Marco kindly provided, I don't
> think ASSERT_EXCLUSIVE_BITS is the correct annotation to make.  The bits
> in question are being accessed in parallel, it's just that we're willing
> to accept a certain degree of inaccuracy.
> 
> > > struct xa_node {
> > >         unsigned char   shift;          /* Bits remaining in each slot */
> > >         unsigned char   offset;         /* Slot offset in parent */
> > >         unsigned char   count;          /* Total entry count */
> > >         unsigned char   nr_values;      /* Value entry count */
> > >         struct xa_node __rcu *parent;   /* NULL at top of tree */
> > >         struct xarray   *array;         /* The array we belong to */
> > >         union {
> > >                 struct list_head private_list;  /* For tree user */
> > >                 struct rcu_head rcu_head;       /* Used when freeing node */
> > >         };
> > >         void __rcu      *slots[XA_CHUNK_SIZE];
> > >         union {
> > >                 unsigned long   tags[XA_MAX_MARKS][XA_MARK_LONGS];
> > >                 unsigned long   marks[XA_MAX_MARKS][XA_MARK_LONGS];
> > >         };
> > > };
> > > 
> > > 'slots' may be modified with the xa_lock
> > > held, and simultaneously read by an RCU reader.  Ditto 'tags'/'marks'.
> > 
> > KCSAN expects that accesses to the ->parent field should be marked.
> > But if ->parent is always accessed via things like rcu_dereference()
> > and rcu_assign_pointer() (guessing based on the __rcu), then KCSAN
> > won't complain.
> [...]
> > The situation with ->slots is the same as that for ->parent.
> > 
> > KCSAN expects accesses to the ->tags[] and ->marks[] arrays to be marked.
> > However, the default configuration of KCSAN asks only that the reads
> > be marked.  (Within RCU, I instead configure KCSAN so that it asks that
> > both be marked, but it is of course your choice within your code.)
> > 
> > > The RCU readers are prepared for what they see to be inconsistent --
> > > a fact of life when dealing with RCU!  So in a sense, yes, there is a
> > > race there.  But it's a known, accepted race, and that acceptance is
> > > indicated by the fact that the RCU lock is held.  Does there need to be
> > > more annotation here?  Or is there an un-noticed bug that the tool is
> > > legitimately pointing out?
> > 
> > The answer to both questions is "maybe", depending on the code using
> > the values read.  Yes, it would be nice if KCSAN could figure out the
> > difference, but there are limits to what a tool can do.  And things
> > are sometimes no-obvious, for example:
> 
> I require the following properties from this array of bits:
> 
>  - If a bit was set before and after the modification, it must be seen to
>    be set.
>  - If a bit was clear before and after the modification, it must be seen to
>    be clear.
>  - If a bit is modified, it may be seen as set or clear.
> 
> I have found three locations where we use the ->marks array:
> 
> 1.
>                         unsigned long data = *addr & (~0UL << offset);
>                         if (data)
>                                 return __ffs(data);
> 
> 2.
>         return find_next_bit(addr, XA_CHUNK_SIZE, offset);
> 3.
>         return test_bit(offset, node_marks(node, mark));
> 
> The modifications -- all done with the spinlock held -- use the non-atomic
> bitops:
>         return __test_and_set_bit(offset, node_marks(node, mark));
>         return __test_and_clear_bit(offset, node_marks(node, mark));
>         bitmap_fill(node_marks(node, mark), XA_CHUNK_SIZE);
> (that last one doesn't really count -- it's done prior to placing the node
> in the tree)
> 
> The first read seems straightforward; I can place a READ_ONCE around
> *addr.  The second & third reads are rather less straightforward.
> find_next_bit() and test_bit() are common code and use plain loads today.

Yes, those last two are a bit annoying, aren't they?  I guess the first
thing would be placing READ_ONCE() inside them, and if that results in
regressions, have an alternative API for concurrent access?

> > Your mileage may vary, of course.  For one thing, your actuarial
> > statistics are quite likley significantly more favorable than are mine.
> > Not that mine are at all bad, particularly by the standards of a century
> > or two ago.  ;-)
> 
> I don't know ... my preferred form of exercise takes me into the
> harm's way of cars, so while I've significantly lowered my risk of
> dying of cardiovascular disease, I'm much more likely to be killed by
> an inattentive human being!  Bring on our robot cars ...

I did my 40 years using that mode of transport.  But my reflexes are
no longer up to the task.  So I have no advice for you on that one!
Other than to question your trust in robot cars whose code was produced
by human developers!  ;-)

							Thanx, Paul