Message-ID: <208a49d2-5f29-c48e-206c-260ee3f1d991@huawei.com>
Date: Fri, 15 Apr 2022 15:07:38 +0800
From: "yukuai (C)" <yukuai3@...wei.com>
To: "Li, Ming" <ming4.li@...el.com>, <axboe@...nel.dk>,
<andriy.shevchenko@...ux.intel.com>, <john.garry@...wei.com>,
<ming.lei@...hat.com>
CC: <linux-block@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<yi.zhang@...wei.com>
Subject: Re: [PATCH -next RFC v2 3/8] sbitmap: make sure waitqueues are
balanced
On 2022/04/15 14:31, Li, Ming wrote:
>
>
> On 4/8/2022 3:39 PM, Yu Kuai wrote:
>> Currently, the same waitqueue might be woken up continuously:
>>
>> __sbq_wake_up                         __sbq_wake_up
>>  sbq_wake_ptr -> assume 0
>>                                        sbq_wake_ptr -> 0
>>  atomic_dec_return
>>                                        atomic_dec_return
>>  atomic_cmpxchg -> succeed
>>                                        atomic_cmpxchg -> failed
>>                                        return true
>>
>>                                       __sbq_wake_up
>>                                        sbq_wake_ptr
>>                                         atomic_read(&sbq->wake_index) -> still 0
>>  sbq_index_atomic_inc -> inc to 1
>>                                         if (waitqueue_active(&ws->wait))
>>                                          if (wake_index != atomic_read(&sbq->wake_index))
>>                                           atomic_set -> reset from 1 to 0
>>  wake_up_nr -> wake up first waitqueue
>>                                        // continue to wake up in first waitqueue
>>
>> What's worse, an I/O hang is possible in theory because wakeups might be
>> missed. For example, 2 * wake_batch tags are put, while only wake_batch
>> threads are woken:
>>
>> __sbq_wake_up
>>  atomic_cmpxchg -> reset wait_cnt
>>                                        __sbq_wake_up -> decrease wait_cnt
>>                                        ...
>>                                        __sbq_wake_up -> wait_cnt is decreased to 0 again
>>                                         atomic_cmpxchg
>>                                         sbq_index_atomic_inc -> increase wake_index
>>                                         wake_up_nr -> wake up and waitqueue might be empty
>>  sbq_index_atomic_inc -> increase again, one waitqueue is skipped
>>  wake_up_nr -> invalid wake up because the old waitqueue might be empty
>>
>> To fix the problem, refactor the wakeup path to make sure that waitqueues
>> are woken up one by one, and also choose the next waitqueue by the number
>> of threads that are waiting, to keep the waitqueues balanced.
> Hi, do you think that updating 'wake_index' before the atomic_cmpxchg() on
> 'ws->wait_cnt' could also solve these two problems? Like this:
Hi,

The first problem is caused by sbq_wake_ptr() using atomic_set() to
update 'wake_index'.
The second problem is caused by __sbq_wake_up() updating 'wait_cnt'
before 'wake_index'.
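
For reference, the relevant part of the current sbq_wake_ptr() looks
roughly like this (paraphrased from lib/sbitmap.c, comments mine):

static struct sbq_wait_state *sbq_wake_ptr(struct sbitmap_queue *sbq)
{
	int i, wake_index;

	wake_index = atomic_read(&sbq->wake_index);
	for (i = 0; i < SBQ_WAIT_QUEUES; i++) {
		struct sbq_wait_state *ws = &sbq->ws[wake_index];

		if (waitqueue_active(&ws->wait)) {
			/*
			 * This atomic_set() can race with a concurrent
			 * sbq_index_atomic_inc() from __sbq_wake_up() and
			 * move 'wake_index' backwards, so the waitqueue
			 * that was just woken gets selected again.
			 */
			if (wake_index != atomic_read(&sbq->wake_index))
				atomic_set(&sbq->wake_index, wake_index);
			return ws;
		}

		wake_index = sbq_index_inc(wake_index);
	}

	return NULL;
}
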
> __sbq_wake_up()
> {
>         ....
>         if (wait_cnt <= 0) {
>                 ret = atomic_cmpxchg(sbq->wake_index, old_wake_index, next_wake_index);
How is 'next_wake_index' chosen? And the same question applies to
sbq_wake_ptr().
>                 if (ret == old_wake_index) {
>                         ret = atomic_cmpxchg(ws->wait_cnt, wait_cnt, wake_batch);
If this fails, we just return true with 'wake_index' already updated? Then
the caller will call this again, so it seems this can't prevent 'wake_index'
from being updated multiple times, and 'wait_cnt' in the old 'ws' is never
reset.
>                         if (ret == wait_cnt)
>                                 wake_up_nr(ws->wait, wake_batch);
>                 }
>         }
> }
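
To make the second point concrete, here is one possible interleaving under
your ordering (a hypothetical trace, in the style of the commit message
above):

t1: __sbq_wake_up                        t2: __sbq_wake_up
 atomic_dec_return -> wait_cnt is 0
 atomic_cmpxchg(wake_index) -> succeed
                                          atomic_dec_return -> wait_cnt is -1
 atomic_cmpxchg(wait_cnt) -> failed
 return true
// 'wake_index' is advanced, but 'wait_cnt' is not reset and nobody in the
// old 'ws' is woken, so a whole batch of wakeups can be lost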
>
> Your solution picks the waitqueue with the largest waiters_cnt as the next
> one to be woken up; I think such a waitqueue could starve.
> If lots of threads in the same waitqueue stop waiting before sbq wakes them
> up, that waitqueue's waiters_cnt will end up much smaller than the others',
> and it looks like sbq_update_wake_index() would never pick it. What do you
> think? Is that possible?
It would be possible if adding threads to the waitqueues were not balanced,
and I suppose it can't happen once tag preemption is disabled. However,
instead of choosing the waitqueue with the largest waiters_cnt, choosing the
next waitqueue with 'waiters_cnt > 0' might be an alternative, as sketched
below.
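
Something like the following untested sketch, assuming the 'waiters_cnt'
counter added in this series is an atomic_t (the exact function shape here
is illustrative only):

static void sbq_update_wake_index(struct sbitmap_queue *sbq,
				  int old_wake_index)
{
	int index = old_wake_index;
	int i;

	/*
	 * Starting from the old position, advance to the next waitqueue
	 * that has at least one waiter, so that every non-empty waitqueue
	 * gets its turn and none of them can starve.
	 */
	for (i = 0; i < SBQ_WAIT_QUEUES; i++) {
		index = sbq_index_inc(index);
		if (atomic_read(&sbq->ws[index].waiters_cnt) > 0) {
			/* only advance if nobody else moved it already */
			atomic_cmpxchg(&sbq->wake_index, old_wake_index,
				       index);
			break;
		}
	}
}
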
Thanks,
Kuai