Date:   Tue, 12 Jul 2022 21:25:26 +0800
From:   Yu Kuai <yukuai1@...weicloud.com>
To:     Jan Kara <jack@...e.cz>, Yu Kuai <yukuai1@...weicloud.com>
Cc:     axboe@...nel.dk, asml.silence@...il.com, osandov@...com,
        kbusch@...nel.org, linux-block@...r.kernel.org,
        linux-kernel@...r.kernel.org, yi.zhang@...wei.com
Subject: Re: [PATCH RFC v3 1/3] sbitmap: fix that same waitqueue can be woken
 up continuously

Hi!

On 2022/07/11 22:20, Jan Kara wrote:
> On Sun 10-07-22 12:21:58, Yu Kuai wrote:
>> From: Yu Kuai <yukuai3@...wei.com>
>>
>> __sbq_wake_up		__sbq_wake_up
>>   sbq_wake_ptr -> assume	0
>> 			 sbq_wake_ptr -> 0
>>   atomic_dec_return
>> 			atomic_dec_return
>>   atomic_cmpxchg -> succeed
>> 			 atomic_cmpxchg -> failed
>> 			  return true
>>
>> 			__sbq_wake_up
>> 			 sbq_wake_ptr
>> 			  atomic_read(&sbq->wake_index) -> still 0
>>   sbq_index_atomic_inc -> inc to 1
>> 			  if (waitqueue_active(&ws->wait))
>> 			   if (wake_index != atomic_read(&sbq->wake_index))
>> 			    atomic_set -> reset from 1 to 0
>>   wake_up_nr -> wake up first waitqueue
>> 			    // continue to wake up in first waitqueue
>>
>> Fix the problem by using atomic_cmpxchg() instead of atomic_set()
>> to update 'wake_index'.
>>
>> Fixes: 417232880c8a ("sbitmap: Replace cmpxchg with xchg")
>> Signed-off-by: Yu Kuai <yukuai3@...wei.com>
> 
> I don't think this patch is really needed after the following patches.  As
> I see it, wake_index is just a performance optimization (plus a fairness
> improvement) but in principle the code in sbq_wake_ptr() is always prone to
> races as the waitqueue it returns needn't have any waiters by the time we
> return. So for correctness the check-and-retry loop needs to happen at
> higher level than inside sbq_wake_ptr() and occasional wrong setting of
> wake_index will result only in a bit of unfairness and more scanning
> looking for suitable waitqueue but I don't think that really justifies the
> cost of atomic operations in cmpxchg loop...

It's right that this patch just improves fairness. However, in heavy-load
tests I found that the 'wrong setting of wake_index' can happen frequently;
as a consequence, some waitqueues can be empty while others have a lot of
waiters.

It would take a lot of work to fix the unfairness thoroughly, so I can drop
this patch for now.
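
To make the 'wrong setting of wake_index' concrete, here is a minimal
user-space sketch that replays the interleaving from the commit message in a
single thread. It is only an illustration under stated assumptions: it uses
C11 <stdatomic.h> rather than the kernel atomic_t API, and the names
publish_old(), publish_new() and replay() are hypothetical helpers, not
functions in lib/sbitmap.c.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define SBQ_WAIT_QUEUES 8

/* Hypothetical stand-in for the shared sbq->wake_index. */
static atomic_int wake_index;

static int index_inc(int index)
{
	return (index + 1) & (SBQ_WAIT_QUEUES - 1);
}

/* Old behaviour: overwrite the shared index whenever it differs from ours. */
static void publish_old(int found)
{
	if (found != atomic_load(&wake_index))
		atomic_store(&wake_index, found);
}

/*
 * Patched behaviour: touch the shared index only if our scan moved past the
 * snapshot, and only if nobody changed the index since we took the snapshot.
 * Returns false when the caller would have to rescan ("goto again").
 */
static bool publish_new(int snapshot, int found)
{
	if (found == snapshot)
		return true;
	return atomic_compare_exchange_strong(&wake_index, &snapshot, found);
}

/*
 * Replay: waker B snapshots index 0, waker A then advances the index to 1,
 * and finally B publishes queue 0, where it found waiters.
 * Returns the shared index after both wakers are done.
 */
static int replay(bool patched)
{
	int snapshot_b, found_b;

	atomic_store(&wake_index, 0);

	snapshot_b = atomic_load(&wake_index);	/* B reads 0 */
	found_b = snapshot_b;			/* B finds waiters in queue 0 */

	atomic_store(&wake_index, index_inc(0));	/* A advances 0 -> 1 */

	if (patched)
		publish_new(snapshot_b, found_b);
	else
		publish_old(found_b);

	return atomic_load(&wake_index);
}
```

With the old logic the index ends up reset to 0, so the first waitqueue keeps
being chosen; with the patched logic A's advance to 1 survives. The sketch
does not model the cost that Honza mentions, namely the extra atomic
operation and the rescan when the compare-and-exchange fails.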

Thanks,
Kuai
> 
> 								Honza
>> ---
>>   lib/sbitmap.c | 15 ++++++++++-----
>>   1 file changed, 10 insertions(+), 5 deletions(-)
>>
>> diff --git a/lib/sbitmap.c b/lib/sbitmap.c
>> index 29eb0484215a..b46fce1beb3a 100644
>> --- a/lib/sbitmap.c
>> +++ b/lib/sbitmap.c
>> @@ -579,19 +579,24 @@ EXPORT_SYMBOL_GPL(sbitmap_queue_min_shallow_depth);
>>   
>>   static struct sbq_wait_state *sbq_wake_ptr(struct sbitmap_queue *sbq)
>>   {
>> -	int i, wake_index;
>> +	int i, wake_index, old_wake_index;
>>   
>> +again:
>>   	if (!atomic_read(&sbq->ws_active))
>>   		return NULL;
>>   
>> -	wake_index = atomic_read(&sbq->wake_index);
>> +	old_wake_index = wake_index = atomic_read(&sbq->wake_index);
>>   	for (i = 0; i < SBQ_WAIT_QUEUES; i++) {
>>   		struct sbq_wait_state *ws = &sbq->ws[wake_index];
>>   
>>   		if (waitqueue_active(&ws->wait)) {
>> -			if (wake_index != atomic_read(&sbq->wake_index))
>> -				atomic_set(&sbq->wake_index, wake_index);
>> -			return ws;
>> +			if (wake_index == old_wake_index)
>> +				return ws;
>> +
>> +			if (atomic_cmpxchg(&sbq->wake_index, old_wake_index,
>> +					   wake_index) == old_wake_index)
>> +				return ws;
>> +			goto again;
>>   		}
>>   
>>   		wake_index = sbq_index_inc(wake_index);
>> -- 
>> 2.31.1
>>
