lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <948d0a25-63fa-7234-57a7-0a9209169a51@oracle.com>
Date:   Fri, 22 Jul 2022 07:09:36 -0500
From:   john.p.donnelly@...cle.com
To:     Waiman Long <longman@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>,
        Boqun Feng <boqun.feng@...il.com>
Cc:     linux-kernel@...r.kernel.org, Hillf Danton <hdanton@...a.com>,
        Mel Gorman <mgorman@...hsingularity.net>
Subject: Re: [PATCH v2] locking/rwsem: Allow slowpath writer to ignore handoff
 bit if not set by first waiter

On 7/1/22 9:37 AM, john.p.donnelly@...cle.com wrote:
> On 6/22/22 3:04 PM, Waiman Long wrote:
>> With commit d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more
>> consistent"), the writer that sets the handoff bit can be interrupted
>> out without clearing the bit if the wait queue isn't empty. This disables
>> reader and writer optimistic lock spinning and stealing.
>>
>> Now if a non-first writer in the queue is somehow woken up or a new
>> waiter enters the slowpath, it can't acquire the lock.  This is not the
>> case before commit d257cc8cb8d5 as the writer that set the handoff bit
>> will clear it when exiting out via the out_nolock path. This is less
>> efficient as the busy rwsem stays in an unlock state for a longer time.
>>
>> In some cases, this new behavior may cause lockups as shown in [1] and
>> [2].
>>
>> This patch allows a non-first writer to ignore the handoff bit if it
>> is not originally set or initiated by the first waiter. This patch is
>> shown to be effective in fixing the lockup problem reported in [1].
>>
>> [1] 
>> https://urldefense.com/v3/__https://lore.kernel.org/lkml/20220617134325.GC30825@techsingularity.net/__;!!ACWV5N9M2RV99hQ!MtLCpFzIovhhzojTACH793QVlr1c2eCOsHnjbuB6O3ydj5cqhlZUjVchbfnbMeD4kQ5AsgjVmX0hGfREWPrEAA$ 
>>
>> [2] 
>> https://urldefense.com/v3/__https://lore.kernel.org/lkml/3f02975c-1a9d-be20-32cf-f1d8e3dfafcc@oracle.com/__;!!ACWV5N9M2RV99hQ!MtLCpFzIovhhzojTACH793QVlr1c2eCOsHnjbuB6O3ydj5cqhlZUjVchbfnbMeD4kQ5AsgjVmX0hGfQCd2bHaQ$ 
>>
>>
>> Fixes: d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more 
>> consistent")
>> Signed-off-by: Waiman Long <longman@...hat.com>
>> Tested-by: Mel Gorman <mgorman@...hsingularity.net>
> 
> Acked-by:  John Donnelly <john.p.donnelly@...cle.com>
> 
>> ---
>>   kernel/locking/rwsem.c | 30 ++++++++++++++++++++----------
>>   1 file changed, 20 insertions(+), 10 deletions(-)
>>
>> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
>> index 9d1db4a54d34..ffd6188d4a7c 100644
>> --- a/kernel/locking/rwsem.c
>> +++ b/kernel/locking/rwsem.c
>> @@ -335,8 +335,6 @@ struct rwsem_waiter {
>>       struct task_struct *task;
>>       enum rwsem_waiter_type type;
>>       unsigned long timeout;
>> -
>> -    /* Writer only, not initialized in reader */
>>       bool handoff_set;
>>   };
>>   #define rwsem_first_waiter(sem) \
>> @@ -459,10 +457,12 @@ static void rwsem_mark_wake(struct rw_semaphore 
>> *sem,
>>                * to give up the lock), request a HANDOFF to
>>                * force the issue.
>>                */
>> -            if (!(oldcount & RWSEM_FLAG_HANDOFF) &&
>> -                time_after(jiffies, waiter->timeout)) {
>> -                adjustment -= RWSEM_FLAG_HANDOFF;
>> -                lockevent_inc(rwsem_rlock_handoff);
>> +            if (time_after(jiffies, waiter->timeout)) {
>> +                if (!(oldcount & RWSEM_FLAG_HANDOFF)) {
>> +                    adjustment -= RWSEM_FLAG_HANDOFF;
>> +                    lockevent_inc(rwsem_rlock_handoff);
>> +                }
>> +                waiter->handoff_set = true;
>>               }
>>               atomic_long_add(-adjustment, &sem->count);
>> @@ -599,7 +599,7 @@ rwsem_del_wake_waiter(struct rw_semaphore *sem, 
>> struct rwsem_waiter *waiter,
>>   static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
>>                       struct rwsem_waiter *waiter)
>>   {
>> -    bool first = rwsem_first_waiter(sem) == waiter;
>> +    struct rwsem_waiter *first =  rwsem_first_waiter(sem);
>>       long count, new;
>>       lockdep_assert_held(&sem->wait_lock);
>> @@ -609,11 +609,20 @@ static inline bool rwsem_try_write_lock(struct 
>> rw_semaphore *sem,
>>           bool has_handoff = !!(count & RWSEM_FLAG_HANDOFF);
>>           if (has_handoff) {
>> -            if (!first)
>> +            /*
>> +             * Honor handoff bit and yield only when the first
>> +             * waiter is the one that set it. Otherwisee, we
>> +             * still try to acquire the rwsem.
>> +             */
>> +            if (first->handoff_set && (waiter != first))
>>                   return false;
>> -            /* First waiter inherits a previously set handoff bit */
>> -            waiter->handoff_set = true;
>> +            /*
>> +             * First waiter can inherit a previously set handoff
>> +             * bit and spin on rwsem if lock acquisition fails.
>> +             */
>> +            if (waiter == first)
>> +                waiter->handoff_set = true;
>>           }
>>           new = count;
>> @@ -1027,6 +1036,7 @@ rwsem_down_read_slowpath(struct rw_semaphore 
>> *sem, long count, unsigned int stat
>>       waiter.task = current;
>>       waiter.type = RWSEM_WAITING_FOR_READ;
>>       waiter.timeout = jiffies + RWSEM_WAIT_TIMEOUT;
>> +    waiter.handoff_set = false;
>>       raw_spin_lock_irq(&sem->wait_lock);
>>       if (list_empty(&sem->wait_list)) {
> 
> 
> Hi,
> 
> ping .
> 
> 
> Can we get this reviewed ?
> 
> 
> The offending commit:
> 
>   d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more  
> consistent")
> 
> 
> is found in all these branches:
> 
>    kernel_dot_org/linux-stable.git    linux-5.15.y           - 76723ed1fb89
>    kernel_dot_org/linux-stable.git    linux-5.16.y           - d257cc8cb8d5
>    kernel_dot_org/linux-stable.git    linux-5.17.y           - d257cc8cb8d5
>    kernel_dot_org/linux-stable.git    linux-5.18.y           - d257cc8cb8d5
>    kernel_dot_org/linux-stable.git    master                 - d257cc8cb8d5
> 
> 
> 

Hi,

This issue now in :  v5.19-rc7

Can we get this reviewed ?



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ