Message-ID: <45a59f35-1e86-67a3-26fc-51fd4a4798e0@alu.unizg.hr>
Date:   Mon, 18 Sep 2023 17:33:08 +0200
From:   Mirsad Todorovac <mirsad.todorovac@....unizg.hr>
To:     Yury Norov <yury.norov@...il.com>
Cc:     Jan Kara <jack@...e.cz>, Philipp Stanner <pstanner@...hat.com>,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        Matthew Wilcox <willy@...radead.org>, Chris Mason <clm@...com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Josef Bacik <josef@...icpanda.com>,
        David Sterba <dsterba@...e.com>, linux-btrfs@...r.kernel.org,
        linux-mm@...ck.org
Subject: Re: [PATCH v1 1/1] xarray: fix the data-race in xas_find_chunk() by
 using READ_ONCE()

On 9/18/23 16:59, Yury Norov wrote:
> On Mon, Sep 18, 2023 at 02:46:02PM +0200, Mirsad Todorovac wrote:
> 
> ...
> 
>> Ah, I see. This is definitely not good. I managed to fix and test the find_next_bit()
>> family, but for this one it seems that simply the following is enough:
>>
>> -------------------------------------------
>>   include/linux/xarray.h | 8 --------
>>   1 file changed, 8 deletions(-)
>>
>> diff --git a/include/linux/xarray.h b/include/linux/xarray.h
>> index 1715fd322d62..89918b65b00d 100644
>> --- a/include/linux/xarray.h
>> +++ b/include/linux/xarray.h
>> @@ -1718,14 +1718,6 @@ static inline unsigned int xas_find_chunk(struct xa_state *xas, bool advance,
>>          if (advance)
>>                  offset++;
>> -       if (XA_CHUNK_SIZE == BITS_PER_LONG) {
>> -               if (offset < XA_CHUNK_SIZE) {
>> -                       unsigned long data = READ_ONCE(*addr) & (~0UL << offset);
>> -                       if (data)
>> -                               return __ffs(data);
>> -               }
>> -               return XA_CHUNK_SIZE;
>> -       }
>>          return find_next_bit(addr, XA_CHUNK_SIZE, offset);
>>   }
> 
> This looks correct. As per my understanding, the removed part is the
> 1-word bitmap optimization for find_next_bit(). If so, it's not needed
> because find_next_bit() already carries this optimization itself.
> 
> ...
> 
>> --------------------------------------------------------
>>   lib/find_bit.c | 33 +++++++++++++++++----------------
>>   1 file changed, 17 insertions(+), 16 deletions(-)
>>
>> diff --git a/lib/find_bit.c b/lib/find_bit.c
>> index 32f99e9a670e..56244e4f744e 100644
>> --- a/lib/find_bit.c
>> +++ b/lib/find_bit.c
>> @@ -18,6 +18,7 @@
>>   #include <linux/math.h>
>>   #include <linux/minmax.h>
>>   #include <linux/swab.h>
>> +#include <asm/rwonce.h>
>>   /*
>>    * Common helper for find_bit() function family
>> @@ -98,7 +99,7 @@ out:                                                                          \
>>    */
>>   unsigned long _find_first_bit(const unsigned long *addr, unsigned long size)
>>   {
>> -       return FIND_FIRST_BIT(addr[idx], /* nop */, size);
>> +       return FIND_FIRST_BIT(READ_ONCE(addr[idx]), /* nop */, size);
>>   }
>>   EXPORT_SYMBOL(_find_first_bit);
>>   #endif
> 
> ...
> 
> That doesn't look correct. READ_ONCE() implies that there's another
> thread modifying the bitmap concurrently. This is not true for the
> vast majority of bitmap API users, and I expect that forcing
> READ_ONCE() would affect performance for them.
> 
> Bitmap functions, with a few rare exceptions like set_bit(), are not
> thread-safe and require users to perform locking/synchronization where
> needed.
> 
> If you really need READ_ONCE, I think it's better to implement a new
> flavor of the function(s) separately, like:
>          find_first_bit_read_once()
> 
> Thanks,
> Yury

Hi,

I see the quirk. AFAICS, correct locking would be more expensive than a READ_ONCE()
flavour of the functions.
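
As for the removed xarray branch: the single-word fast path that find_next_bit() itself
provides when the size is a compile-time constant looks roughly like the sketch below
(reconstructed from memory of include/linux/find.h, so take it as an approximation
rather than the exact current code):

static __always_inline
unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
			    unsigned long offset)
{
	/* Size known at compile time and <= BITS_PER_LONG: one-word case */
	if (small_const_nbits(size)) {
		unsigned long val;

		if (unlikely(offset >= size))
			return size;

		val = *addr & GENMASK(size - 1, offset);
		return val ? __ffs(val) : size;
	}

	return _find_next_bit(addr, size, offset);
}

So with XA_CHUNK_SIZE == BITS_PER_LONG the compiler should end up with essentially the
same code the removed branch open-coded.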

Only, one would have to inspect every line where these bitmap functions are used and
see whether the *_read_once() version is needed there.
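
Just to make that concrete, a *_read_once() flavour along the lines Yury suggests could
look roughly like the sketch below. The name find_first_bit_read_once() and the
open-coded loop are only an illustration (not an existing kernel symbol); it mirrors
what _find_first_bit() does, with the per-word read annotated:

/* Hypothetical illustration only -- not an existing kernel API */
static inline unsigned long
find_first_bit_read_once(const unsigned long *addr, unsigned long size)
{
	unsigned long idx;

	for (idx = 0; idx * BITS_PER_LONG < size; idx++) {
		/* READ_ONCE() marks the racy read as intentional for KCSAN */
		unsigned long val = READ_ONCE(addr[idx]);

		if (val)
			return min(idx * BITS_PER_LONG + __ffs(val), size);
	}

	return size;
}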

I suppose people will not be happy because of the duplication of code. :-(
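
Then again, the duplication might stay fairly small: lib/find_bit.c already funnels
everything through the FIND_FIRST_BIT() helper macro (and its FIND_NEXT_BIT() sibling)
and takes the fetch expression as a parameter, so a separate flavour could presumably
be a thin wrapper along these lines (the symbol name below is made up for illustration):

/* Sketch: a separate exported flavour instead of changing _find_first_bit() */
unsigned long _find_first_bit_read_once(const unsigned long *addr, unsigned long size)
{
	/* READ_ONCE() goes into the fetch expression; the loop body stays shared */
	return FIND_FIRST_BIT(READ_ONCE(addr[idx]), /* nop */, size);
}
EXPORT_SYMBOL(_find_first_bit_read_once);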

It is not a lot of work; I will do a PoC and see if KCSAN still complains.
(That was the basic reason for all of this in the first place: something changed the
data from underneath our fingers ...)

Best regards,
Mirsad

