lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <78f45c3c-191b-bd14-3b38-522907d0e24f@redhat.com>
Date:   Tue, 29 Jun 2021 10:40:12 -0400
From:   Waiman Long <llong@...hat.com>
To:     "Xu, Yanfei" <yanfei.xu@...driver.com>,
        Waiman Long <llong@...hat.com>, peterz@...radead.org,
        mingo@...hat.com, boqun.feng@...il.com
Cc:     linux-kernel@...r.kernel.org
Subject: Re: [PATCH] locking/mutex: fix the MUTEX_FLAG_HANDOFF bit is cleared
 unexpected

On 6/29/21 5:52 AM, Xu, Yanfei wrote:
>
>
> On 6/29/21 1:57 AM, Waiman Long wrote:
>> [Please note: This e-mail is from an EXTERNAL e-mail address]
>>
>> On 6/28/21 11:51 AM, Yanfei Xu wrote:
>>> When the mutex unlock path is excuted with WAITERS bit and without
>>> HANDOFF bit set, it will wake up the first task in wait_list. If
>>> there are some tasks not in wait_list are stealing lock, it is very
>>> likely successfully due to the task field of lock is NULL and flags
>>> field is non-NULL. Then the HANDOFF bit will be cleared. But if the
>>> HANDOFF bit was just set by the waked task in wait_list, this clearing
>>> is unexcepted.
>>
>> I think you mean "unexpected". Right? Anyway, all the setting and
>
> Right. It's my fault.
>
>> clearing of the HANDOFF bit are atomic. There shouldn't be any
>> unexpected clearing.
>>
>>> __mutex_lock_common __mutex_lock_common
>>>    __mutex_trylock schedule_preempt_disabled
>>>      /*steal lock successfully*/ 
>>> __mutex_set_flag(lock,MUTEX_FLAG_HANDOFF)
>>>      __mutex_trylock_or_owner
>>>        if (task==NULL)
>>>        flags &= ~MUTEX_FLAG_HANDOFF
>>>        atomic_long_cmpxchg_acquire
>>>                                          __mutex_trylock //failed
>>> mutex_optimistic_spin  //failed
>>> schedule_preempt_disabled  //sleep without HANDOFF bit
>>>
>>> So the HANDOFF bit should be set as late as possible, here we defer
>>> it util the task is going to be scheduled.
>>> Signed-off-by: Yanfei Xu <yanfei.xu@...driver.com>
>>> ---
>>>
>>> Hi maintainers,
>>>
>>> I am not very sure if I missed or misunderstanded something, please 
>>> help
>>> to review. Many thanks!
>>>
>>>   kernel/locking/mutex.c | 8 ++++----
>>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
>>> index 013e1b08a1bf..e57d920e96bf 100644
>>> --- a/kernel/locking/mutex.c
>>> +++ b/kernel/locking/mutex.c
>>> @@ -1033,17 +1033,17 @@ __mutex_lock_common(struct mutex *lock, long 
>>> state, unsigned int subclass,
>>>               }
>>>
>>>               spin_unlock(&lock->wait_lock);
>>> +
>>> +             if (first)
>>> +                     __mutex_set_flag(lock, MUTEX_FLAG_HANDOFF);
>>>               schedule_preempt_disabled();
>>>
>>>               /*
>>>                * ww_mutex needs to always recheck its position since 
>>> its waiter
>>>                * list is not FIFO ordered.
>>>                */
>>> -             if (ww_ctx || !first) {
>>> +             if (ww_ctx || !first)
>>>                       first = __mutex_waiter_is_first(lock, &waiter);
>>> -                     if (first)
>>> -                             __mutex_set_flag(lock, 
>>> MUTEX_FLAG_HANDOFF);
>>> -             }
>>>
>>>               set_current_state(state);
>>>               /*
>>
>> In general, I don't mind setting the HANDOFF bit later, but
>> mutex_optimistic_spin() at the end of the loop should only be called
>> after the HANDOFF bit is set. So the logic isn't quite right yet.
>
> Thanks for your reply.
>
> Why the mutex_optimistic_spin should be called after the HANDOFF bit is
> set? I think the HANDOFF bit is only related to unlock path, and the
> mutex_optimistic_spin and __mutex_trylock don't actually use it. (Or I
> missed something? )

The purpose of doing spinning after the HANDOFF bit is set is to 
eliminate the waiter wakeup latency, if possible. Before the HANDOFF bit 
is set, the lock can be easily stolen and there is no point in adding 
pressure to the lock cacheline.


>
> Maybe I described the issue not much clearly. Let me try again.
>
> The HANDOFF bit aims to avoid lock starvation. Lock starvation is
> possible because mutex_lock() allows lock stealing, where a runing( or
> optimistic spinning) task beats the woken waiter to the acquire. So
> maintainer add HANDOFF bit, once the stealing happened, the top-waiter
> will must get the lock at the second wake up due to the HANDOFF bit.
>
> However, the fact is if the stealing happened, the HANDOFF will must be
> clear by the thief task. Hence the top-waiter still might starve at the
> second wake up.
>
I think you have some confusion here. If the HANDOFF bit is set before 
the lock is stolen by an optimistic spinner, lock stealing can't happen 
which is the point of having a HANDOFF bit. Also the clearing of the 
HANDOFF bit isn't done by the task that steal the lock, it is done by 
the current lock holder (i.e. the task that held the lock when the 
HANDOFF bit was set) in the unlock path. It can be a lock stealer that 
stole the lock before the HANDOFF bit is set. Of course, it will be a 
bug if the current code doesn't actually do that.

Cheers,
Longman

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ