Message-ID: <45a59f35-1e86-67a3-26fc-51fd4a4798e0@alu.unizg.hr>
Date: Mon, 18 Sep 2023 17:33:08 +0200
From: Mirsad Todorovac <mirsad.todorovac@....unizg.hr>
To: Yury Norov <yury.norov@...il.com>
Cc: Jan Kara <jack@...e.cz>, Philipp Stanner <pstanner@...hat.com>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
Matthew Wilcox <willy@...radead.org>, Chris Mason <clm@...com>,
Andrew Morton <akpm@...ux-foundation.org>,
Josef Bacik <josef@...icpanda.com>,
David Sterba <dsterba@...e.com>, linux-btrfs@...r.kernel.org,
linux-mm@...ck.org
Subject: Re: [PATCH v1 1/1] xarray: fix the data-race in xas_find_chunk() by
using READ_ONCE()
On 9/18/23 16:59, Yury Norov wrote:
> On Mon, Sep 18, 2023 at 02:46:02PM +0200, Mirsad Todorovac wrote:
>
> ...
>
>> Ah, I see. This is definitely not good. I managed to fix and test the find_next_bit()
>> family, but this one seems to be fixed simply by removing the open-coded fast path:
>>
>> -------------------------------------------
>> include/linux/xarray.h | 8 --------
>> 1 file changed, 8 deletions(-)
>>
>> diff --git a/include/linux/xarray.h b/include/linux/xarray.h
>> index 1715fd322d62..89918b65b00d 100644
>> --- a/include/linux/xarray.h
>> +++ b/include/linux/xarray.h
>> @@ -1718,14 +1718,6 @@ static inline unsigned int xas_find_chunk(struct xa_state *xas, bool advance,
>> if (advance)
>> offset++;
>> - if (XA_CHUNK_SIZE == BITS_PER_LONG) {
>> - if (offset < XA_CHUNK_SIZE) {
>> - unsigned long data = READ_ONCE(*addr) & (~0UL << offset);
>> - if (data)
>> - return __ffs(data);
>> - }
>> - return XA_CHUNK_SIZE;
>> - }
>> return find_next_bit(addr, XA_CHUNK_SIZE, offset);
>> }
>
> This looks correct. As per my understanding, the removed part is the
> 1-word bitmap optimization for find_next_bit. If so, it's not needed
> because find_next_bit() bears this optimization itself.
>
> ...
>
>> --------------------------------------------------------
>> lib/find_bit.c | 33 +++++++++++++++++----------------
>> 1 file changed, 17 insertions(+), 16 deletions(-)
>>
>> diff --git a/lib/find_bit.c b/lib/find_bit.c
>> index 32f99e9a670e..56244e4f744e 100644
>> --- a/lib/find_bit.c
>> +++ b/lib/find_bit.c
>> @@ -18,6 +18,7 @@
>> #include <linux/math.h>
>> #include <linux/minmax.h>
>> #include <linux/swab.h>
>> +#include <asm/rwonce.h>
>> /*
>> * Common helper for find_bit() function family
>> @@ -98,7 +99,7 @@ out: \
>> */
>> unsigned long _find_first_bit(const unsigned long *addr, unsigned long size)
>> {
>> - return FIND_FIRST_BIT(addr[idx], /* nop */, size);
>> + return FIND_FIRST_BIT(READ_ONCE(addr[idx]), /* nop */, size);
>> }
>> EXPORT_SYMBOL(_find_first_bit);
>> #endif
>
> ...
>
> That doesn't look correct. READ_ONCE() implies that there's another
> thread modifying the bitmap concurrently. This is not true for the
> vast majority of bitmap API users, and I expect that forcing
> READ_ONCE() would hurt their performance.
>
> Bitmap functions, with a few rare exceptions like set_bit(), are not
> thread-safe and require users to perform locking/synchronization where
> needed.
>
> If you really need READ_ONCE, I think it's better to implement a new
> flavor of the function(s) separately, like:
> find_first_bit_read_once()
>
> Thanks,
> Yury
Hi,
I see the quirk. AFAICS, correct locking would be more expensive than a READ_ONCE()
flavour of these functions.
One only has to inspect every call site and decide whether the *_read_once() version
is needed there.
I suppose people will not be happy about the code duplication. :-(
It is not a lot of work; I will write a PoC and see if KCSAN still complains.
(That was the basic reason for all of this in the first place: something was
changing the data from under our fingers ...)
Best regards,
Mirsad