[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8241576e-ac92-1a0b-b1ad-2372b61759ea@windriver.com>
Date: Wed, 30 Jun 2021 14:20:42 +0800
From: "Xu, Yanfei" <yanfei.xu@...driver.com>
To: Waiman Long <llong@...hat.com>, peterz@...radead.org,
mingo@...hat.com, boqun.feng@...il.com
Cc: linux-kernel@...r.kernel.org
Subject: Re: [PATCH] locking/mutex: fix the MUTEX_FLAG_HANDOFF bit is cleared
unexpected
On 6/29/21 11:18 PM, Waiman Long wrote:
> [Please note: This e-mail is from an EXTERNAL e-mail address]
>
> On 6/29/21 10:40 AM, Waiman Long wrote:
>> On 6/29/21 5:52 AM, Xu, Yanfei wrote:
>>>
>>>
>>> On 6/29/21 1:57 AM, Waiman Long wrote:
>>>> [Please note: This e-mail is from an EXTERNAL e-mail address]
>>>>
>>>> On 6/28/21 11:51 AM, Yanfei Xu wrote:
>>>>> When the mutex unlock path is excuted with WAITERS bit and without
>>>>> HANDOFF bit set, it will wake up the first task in wait_list. If
>>>>> there are some tasks not in wait_list are stealing lock, it is very
>>>>> likely successfully due to the task field of lock is NULL and flags
>>>>> field is non-NULL. Then the HANDOFF bit will be cleared. But if the
>>>>> HANDOFF bit was just set by the waked task in wait_list, this clearing
>>>>> is unexcepted.
>>>>
>>>> I think you mean "unexpected". Right? Anyway, all the setting and
>>>
>>> Right. It's my fault.
>>>
>>>> clearing of the HANDOFF bit are atomic. There shouldn't be any
>>>> unexpected clearing.
>>>>
>>>>> __mutex_lock_common __mutex_lock_common
>>>>> __mutex_trylock schedule_preempt_disabled
>>>>> /*steal lock successfully*/
>>>>> __mutex_set_flag(lock,MUTEX_FLAG_HANDOFF)
>>>>> __mutex_trylock_or_owner
>>>>> if (task==NULL)
>>>>> flags &= ~MUTEX_FLAG_HANDOFF
>>>>> atomic_long_cmpxchg_acquire
>>>>> __mutex_trylock //failed
>>>>> mutex_optimistic_spin //failed
>>>>> schedule_preempt_disabled //sleep without HANDOFF bit
>>>>>
>>>>> So the HANDOFF bit should be set as late as possible, here we defer
>>>>> it util the task is going to be scheduled.
>>>>> Signed-off-by: Yanfei Xu <yanfei.xu@...driver.com>
>>>>> ---
>>>>>
>>>>> Hi maintainers,
>>>>>
>>>>> I am not very sure if I missed or misunderstanded something, please
>>>>> help
>>>>> to review. Many thanks!
>>>>>
>>>>> kernel/locking/mutex.c | 8 ++++----
>>>>> 1 file changed, 4 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
>>>>> index 013e1b08a1bf..e57d920e96bf 100644
>>>>> --- a/kernel/locking/mutex.c
>>>>> +++ b/kernel/locking/mutex.c
>>>>> @@ -1033,17 +1033,17 @@ __mutex_lock_common(struct mutex *lock,
>>>>> long state, unsigned int subclass,
>>>>> }
>>>>>
>>>>> spin_unlock(&lock->wait_lock);
>>>>> +
>>>>> + if (first)
>>>>> + __mutex_set_flag(lock, MUTEX_FLAG_HANDOFF);
>>>>> schedule_preempt_disabled();
>>>>>
>>>>> /*
>>>>> * ww_mutex needs to always recheck its position
>>>>> since its waiter
>>>>> * list is not FIFO ordered.
>>>>> */
>>>>> - if (ww_ctx || !first) {
>>>>> + if (ww_ctx || !first)
>>>>> first = __mutex_waiter_is_first(lock, &waiter);
>>>>> - if (first)
>>>>> - __mutex_set_flag(lock,
>>>>> MUTEX_FLAG_HANDOFF);
>>>>> - }
>>>>>
>>>>> set_current_state(state);
>>>>> /*
>>>>
>>>> In general, I don't mind setting the HANDOFF bit later, but
>>>> mutex_optimistic_spin() at the end of the loop should only be called
>>>> after the HANDOFF bit is set. So the logic isn't quite right yet.
>>>
>>> Thanks for your reply.
>>>
>>> Why the mutex_optimistic_spin should be called after the HANDOFF bit is
>>> set? I think the HANDOFF bit is only related to unlock path, and the
>>> mutex_optimistic_spin and __mutex_trylock don't actually use it. (Or I
>>> missed something? )
>>
>> The purpose of doing spinning after the HANDOFF bit is set is to
>> eliminate the waiter wakeup latency, if possible. Before the HANDOFF
>> bit is set, the lock can be easily stolen and there is no point in
>> adding pressure to the lock cacheline.
>>
>>
>>>
>>> Maybe I described the issue not much clearly. Let me try again.
>>>
>>> The HANDOFF bit aims to avoid lock starvation. Lock starvation is
>>> possible because mutex_lock() allows lock stealing, where a runing( or
>>> optimistic spinning) task beats the woken waiter to the acquire. So
>>> maintainer add HANDOFF bit, once the stealing happened, the top-waiter
>>> will must get the lock at the second wake up due to the HANDOFF bit.
>>>
>>> However, the fact is if the stealing happened, the HANDOFF will must be
>>> clear by the thief task. Hence the top-waiter still might starve at the
>>> second wake up.
>>>
>> I think you have some confusion here. If the HANDOFF bit is set before
>> the lock is stolen by an optimistic spinner, lock stealing can't
>> happen which is the point of having a HANDOFF bit. Also the clearing
>> of the HANDOFF bit isn't done by the task that steal the lock, it is
>> done by the current lock holder (i.e. the task that held the lock when
>> the HANDOFF bit was set) in the unlock path. It can be a lock stealer
>> that stole the lock before the HANDOFF bit is set. Of course, it will
>> be a bug if the current code doesn't actually do that.
>
> Oh, you are right. The current code doesn't actually prevent lock
> stealer from actually stealing the lock in the special case that the
> lock is in the unlock state when the HANDOFF bit is set. In this case,
How about setting the HANDOFF bit before the top-waiter first give up
cpu and fall asleep. Then It must can get the lock after being woken up,
and there is no chance happen stealing lock. And I sent a v2 with this.
> it is free for all and whoever gets the lock will also clear the the
> HANDOFF bit. The comment in __mutex_trylock_or_owner() about "We set the
> HANDOFF bit" isn't quite right.
>
> One way to address this issue is to enforce the rule that non-first
> waiter can't steal the lock when the HANDOFF bit is set. That probably
> eliminates the need of a separate PICKUP bit.
>
> Alternatively, we can let this state happens similar to what your patch
> proposes. However, we should clearly document in the code this special
> race condition.
Yes, the document is obsolete after commit e274795ea7b7 ("locking/mutex:
Fix mutex handoff").
Thanks,
Yanfei
>
> Cheers,
> Longman
>
>
Powered by blists - more mailing lists