linux-kernel - Re: [PATCH 1/2] lib/find: Make functions safe on changing bitmaps

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZSoCt5+DybhpsuGv@yury-ThinkPad>
Date:   Fri, 13 Oct 2023 19:53:43 -0700
From:   Yury Norov <yury.norov@...il.com>
To:     Mirsad Goran Todorovac <mirsad.todorovac@....unizg.hr>
Cc:     Jan Kara <jack@...e.cz>,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Rasmus Villemoes <linux@...musvillemoes.dk>,
        Matthew Wilcox <willy@...radead.org>,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] lib/find: Make functions safe on changing bitmaps

On Sat, Oct 14, 2023 at 04:21:50AM +0200, Mirsad Goran Todorovac wrote:
> On 10/14/2023 2:15 AM, Yury Norov wrote:
> > Restore LKML
> > 
> > On Thu, Oct 12, 2023 at 02:21:10PM +0200, Jan Kara wrote:
> > > On Wed 11-10-23 11:26:29, Yury Norov wrote:
> > > > Long story short: KCSAN found some potential issues related to how
> > > > people use bitmap API. And instead of working through that issues,
> > > > the following code shuts down KCSAN by applying READ_ONCE() here
> > > > and there.
> > > 
> > > I'm sorry but this is not what the patch does. I'm not sure how to get the
> > > message across so maybe let me start from a different angle:
> > > 
> > > Bitmaps are perfectly fine to be used without any external locking if
> > > only atomic bit ops (set_bit, clear_bit, test_and_{set/clear}_bit) are
> > > used. This is a significant performance gain compared to using a spinlock
> > > or other locking and people do this for a long time. I hope we agree on
> > > that.
> > > 
> > > Now it is also common that you need to find a set / clear bit in a bitmap.
> > > To maintain lockless protocol and deal with races people employ schemes
> > > like (the dumbest form):
> > > 
> > > 	do {
> > > 		bit = find_first_bit(bitmap, n);
> > > 		if (bit >= n)
> > > 			abort...
> > > 	} while (!test_and_clear_bit(bit, bitmap));
> > > 
> > > So the code loops until it finds a set bit that is successfully cleared by
> > > it. This is perfectly fine and safe lockless code and such use should be
> > > supported. Agreed?
> > 
> > Great example. When you're running non-atomic functions concurrently,
> > the result may easily become incorrect, and this is what you're
> > demonstrating here.
> > 
> > Regarding find_first_bit() it means that:
> >   - it may erroneously return unset bit;
> >   - it may erroneously return non-first set bit;
> >   - it may erroneously return no bits for non-empty bitmap.
> > 
> > Effectively it means that find_first bit may just return a random number.
> > 
> > Let's take another example:
> > 
> > 	do {
> > 		bit = get_random_number();
> > 		if (bit >= n)
> > 			abort...
> > 	} while (!test_and_clear_bit(bit, bitmap));
> > 
> > When running concurrently, the difference between this and your code
> > is only in probability of getting set bit somewhere from around the
> > beginning of bitmap.
> > 
> > The key point is that find_bit() may return undef even if READ_ONCE() is
> > used. If bitmap gets changed anytime in the process, the result becomes
> > invalid. It may happen even after returning from find_first_bit().
> > 
> > And if my understanding correct, your code is designed in the
> > assumption that find_first_bit() may return garbage, so handles it
> > correctly.
> > 
> > > *Except* that the above actually is not safe due to find_first_bit()
> > > implementation and KCSAN warns about that. The problem is that:
> > > 
> > > Assume *addr == 1
> > > CPU1			CPU2
> > > find_first_bit(addr, 64)
> > >    val = *addr;
> > >    if (val) -> true
> > > 			clear_bit(0, addr)
> > >      val = *addr -> compiler decided to refetch addr contents for whatever
> > > 		   reason in the generated assembly
> > >      __ffs(val) -> now executed for value 0 which has undefined results.
> > 
> > Yes, __ffs(0) is undef. But the whole function is undef when accessing
> > bitmap concurrently.
> > 
> > > And the READ_ONCE() this patch adds prevents the compiler from adding the
> > > refetching of addr into the assembly.
> > 
> > That's true. But it doesn't improve on the situation. It was an undef
> > before, and it's undef after, but a 2% slower undef.
> > 
> > Now on that KCSAN warning. If I understand things correctly, for the
> > example above, KCSAN warning is false-positive, because you're
> > intentionally running lockless.
> > 
> > But for some other people it may be a true error, and now they'll have
> > no chance to catch it if KCSAN is forced to ignore find_bit() entirely.
> > 
> > We've got the whole class of lockless algorithms that allow safe concurrent
> > access to the memory. And now that there's a tool that searches for them
> > (concurrent accesses), we need to have an option to somehow teach it
> > to suppress irrelevant warnings. Maybe something like this?
> > 
> >          lockless_algorithm_begin(bitmap, bitmap_size(nbits));
> > 	do {
> > 		bit = find_first_bit(bitmap, nbits);
> > 		if (bit >= nbits)
> > 			break;
> > 	} while (!test_and_clear_bit(bit, bitmap));
> >          lockless_algorithm_end(bitmap, bitmap_size(nbits));
> > 
> > And, of course, as I suggested a couple iterations ago, you can invent
> > a thread-safe version of find_bit(), that would be perfectly correct
> > for lockless use:
> > 
> >   unsigned long _find_and_clear_bit(volatile unsigned long *addr, unsigned long size)
> >   {
> >          unsigned long bit = 0;
> >          while (!test_and_clear_bit(bit, bitmap) {
> >                  bit = FIND_FIRST_BIT(addr[idx], /* nop */, size);
> >                  if (bit >= size)
> >                          return size;
> >          }
> > 
> >          return bit;
> >   }
> 
> Hi, Yuri,
> 
> But the code above effectively does the same as the READ_ONCE() macro
> as defined in rwonce.h:
> 
> #ifndef __READ_ONCE
> #define __READ_ONCE(x)	(*(const volatile __unqual_scalar_typeof(x) *)&(x))
> #endif
> 
> #define READ_ONCE(x)							\
> ({									\
> 	compiletime_assert_rwonce_type(x);				\
> 	__READ_ONCE(x);							\
> })
> 
> Both uses only prevent the funny stuff the compiler might have done to the
> read of the addr[idx], there's no black magic in READ_ONCE().
> 
> Both examples would probably result in the same assembly and produce the
> same 2% slowdown ...
> 
> Only you declare volatile in one place, and READ_ONCE() in each read, but
> this will only compile a bit slower and generate the same machine code.

The difference is that find_and_clear_bit() has a semantics of
atomic operation. Those who will decide to use it will also anticipate
associate downsides. And other hundreds (or thousands) users of
non-atomic find_bit() functions will not have to pay extra buck
for unneeded atomicity.

Check how 'volatile' is used in test_and_clear_bit(), and consider
find_and_clear_bit() as a wrapper around test_and_clear_bit().

In other words, this patch suggests to make find_bit() thread-safe by
using READ_ONCE(), and it doesn't work. find_and_clear_bit(), on the
other hand, is simply a wrapper around test_and_clear_bit(), and
doesn't imply any new restriction that test_and_clear_bit() doesn't.

Think of it as an optimized version of:
         while (bit < nbits && !test_and_clear_bit(bit, bitmap)
                bit++;

If you think it's worth to try in your code, I can send a patch for
you.

Thanks,
Yury