Message-ID: <496b87d9-c89d-3d4f-8ba8-5bb706de7fd0@huaweicloud.com>
Date:   Wed, 7 Sep 2022 09:12:38 +0800
From:   Yu Kuai <yukuai1@...weicloud.com>
To:     Keith Busch <kbusch@...nel.org>, Yu Kuai <yukuai1@...weicloud.com>
Cc:     jack@...e.cz, axboe@...nel.dk, osandov@...com,
        linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
        yi.zhang@...wei.com, "yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH] sbitmap: fix possible io hung due to lost wakeup

Hi,

On 2022/09/07 5:27, Keith Busch wrote:
> On Wed, Aug 03, 2022 at 08:15:04PM +0800, Yu Kuai wrote:
>>   	wait_cnt = atomic_dec_return(&ws->wait_cnt);
>> -	if (wait_cnt <= 0) {
>> -		int ret;
>> +	/*
>> +	 * For concurrent callers of this, callers should call this function
>> +	 * again to wakeup a new batch on a different 'ws'.
>> +	 */
>> +	if (wait_cnt < 0 || !waitqueue_active(&ws->wait))
>> +		return true;
> 
> If wait_cnt is '0', but waitqueue_active() happens to be false due to racing
> with add_wait_queue(), this returns true so the caller will retry. The next
> atomic_dec will set the current wait state's wait_cnt < 0, which also forces
> an early return true. When does the wake up happen, or when do wait_cnt and
> wake_index get updated, in that case?

If the waitqueue becomes empty, then concurrent callers can proceed:

__sbq_wake_up
  sbq_wake_ptr
   for (i = 0; i < SBQ_WAIT_QUEUES; i++)
    if (waitqueue_active(&ws->wait)) -> only an active waitqueue is chosen

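(For context, a rough sketch of that selection loop, paraphrased from
lib/sbitmap.c of this era rather than quoted verbatim: an emptied 'ws' is
skipped, so a racing caller just moves on to a waitqueue that has waiters.)

static struct sbq_wait_state *sbq_wake_ptr(struct sbitmap_queue *sbq)
{
	int i, wake_index = atomic_read(&sbq->wake_index);

	for (i = 0; i < SBQ_WAIT_QUEUES; i++) {
		struct sbq_wait_state *ws = &sbq->ws[wake_index];

		/* skip emptied waitqueues; only one with waiters is returned */
		if (waitqueue_active(&ws->wait)) {
			if (wake_index != atomic_read(&sbq->wake_index))
				atomic_set(&sbq->wake_index, wake_index);
			return ws;
		}

		wake_index = sbq_index_inc(wake_index);
	}

	return NULL;	/* no active waiters on any waitqueue */
}
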
If the waitqueue is not empty, behavior is the same with or without this
patch: concurrent callers have to wait for the one that wins the race:

Before:
  __sbq_wake_up
   atomic_cmpxchg -> win the race
   sbq_index_atomic_inc -> concurrent callers can proceed

After:
  __sbq_wake_up
  wake_up_nr -> concurrent callers can proceed if the waitqueue becomes empty
  atomic_dec_return -> return 0
  sbq_index_atomic_inc
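
Putting the hunks together, __sbq_wake_up() after this patch reads roughly
as follows (assembled from the diff quoted in this mail; the sbq_wake_ptr()
prologue is paraphrased from lib/sbitmap.c and is not part of the change):

static bool __sbq_wake_up(struct sbitmap_queue *sbq)
{
	struct sbq_wait_state *ws;
	unsigned int wake_batch;
	int wait_cnt;

	ws = sbq_wake_ptr(sbq);
	if (!ws)
		return false;

	wait_cnt = atomic_dec_return(&ws->wait_cnt);
	/* concurrent callers should retry on a different 'ws' */
	if (wait_cnt < 0 || !waitqueue_active(&ws->wait))
		return true;

	if (wait_cnt > 0)
		return false;

	wake_batch = READ_ONCE(sbq->wake_batch);

	/*
	 * Wake up first, in case concurrent callers decrement wait_cnt
	 * while the waitqueue is empty.
	 */
	wake_up_nr(&ws->wait, wake_batch);

	/* pairs with the memory barrier in sbitmap_queue_resize() */
	smp_mb__before_atomic();

	/*
	 * Increase wake_index before resetting wait_cnt, so concurrent
	 * callers cannot see a valid wait_cnt on the old waitqueue.
	 */
	sbq_index_atomic_inc(&sbq->wake_index);
	atomic_set(&ws->wait_cnt, wake_batch);

	return false;
}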

Thanks,
Kuai
>    
>> -		wake_batch = READ_ONCE(sbq->wake_batch);
>> +	if (wait_cnt > 0)
>> +		return false;
>>   
>> -		/*
>> -		 * Pairs with the memory barrier in sbitmap_queue_resize() to
>> -		 * ensure that we see the batch size update before the wait
>> -		 * count is reset.
>> -		 */
>> -		smp_mb__before_atomic();
>> +	wake_batch = READ_ONCE(sbq->wake_batch);
>>   
>> -		/*
>> -		 * For concurrent callers of this, the one that failed the
>> -		 * atomic_cmpxhcg() race should call this function again
>> -		 * to wakeup a new batch on a different 'ws'.
>> -		 */
>> -		ret = atomic_cmpxchg(&ws->wait_cnt, wait_cnt, wake_batch);
>> -		if (ret == wait_cnt) {
>> -			sbq_index_atomic_inc(&sbq->wake_index);
>> -			wake_up_nr(&ws->wait, wake_batch);
>> -			return false;
>> -		}
>> +	/*
>> +	 * Wake up first in case that concurrent callers decrease wait_cnt
>> +	 * while waitqueue is empty.
>> +	 */
>> +	wake_up_nr(&ws->wait, wake_batch);
>>   
>> -		return true;
>> -	}
>> +	/*
>> +	 * Pairs with the memory barrier in sbitmap_queue_resize() to
>> +	 * ensure that we see the batch size update before the wait
>> +	 * count is reset.
>> +	 *
>> +	 * Also pairs with the implicit barrier between decrementing wait_cnt
>> +	 * and checking for waitqueue_active() to make sure waitqueue_active()
>> +	 * sees result of the wakeup if atomic_dec_return() has seen the result
>> +	 * of atomic_set().
>> +	 */
>> +	smp_mb__before_atomic();
>> +
>> +	/*
>> +	 * Increase wake_index before updating wait_cnt, otherwise concurrent
>> +	 * callers can see valid wait_cnt in old waitqueue, which can cause
>> +	 * invalid wakeup on the old waitqueue.
>> +	 */
>> +	sbq_index_atomic_inc(&sbq->wake_index);
>> +	atomic_set(&ws->wait_cnt, wake_batch);
>>   
>>   	return false;
>>   }
>> -- 
>> 2.31.1
>>
