lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 16 Nov 2021 09:52:02 -0500
From:   Waiman Long <longman@...hat.com>
To:     Peter Zijlstra <peterz@...radead.org>,
        "Aiqun(Maria) Yu" <quic_aiquny@...cinc.com>
Cc:     Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>,
        linux-kernel@...r.kernel.org, Davidlohr Bueso <dave@...olabs.net>,
        mazhenhua <mazhenhua@...omi.com>, Hillf Danton <hdanton@...a.com>
Subject: Re: [PATCH v5] locking/rwsem: Make handoff bit handling more
 consistent


On 11/16/21 04:24, Peter Zijlstra wrote:
> On Tue, Nov 16, 2021 at 10:14:20AM +0100, Peter Zijlstra wrote:
>> On Tue, Nov 16, 2021 at 10:52:42AM +0800, Aiqun(Maria) Yu wrote:
>>> On 11/16/2021 9:29 AM, Waiman Long wrote:
>>>> There are some inconsistency in the way that the handoff bit is being
>>>> handled in readers and writers.
>>>>
>>>> Firstly, when a queue head writer set the handoff bit, it will clear it
>>>> when the writer is being killed or interrupted on its way out without
>>>> acquiring the lock. That is not the case for a queue head reader. The
>>>> handoff bit will simply be inherited by the next waiter.
>>>>
>>>> Secondly, in the out_nolock path of rwsem_down_read_slowpath(), both
>>>> the waiter and handoff bits are cleared if the wait queue becomes empty.
>>>> For rwsem_down_write_slowpath(), however, the handoff bit is not checked
>>>> and cleared if the wait queue is empty. This can potentially make the
>>>> handoff bit set with empty wait queue.
>>>>
>>>> To make the handoff bit handling more consistent and robust, extract
>>>> out handoff bit clearing code into the new rwsem_del_waiter() helper
>>>> function.  The common function will only use atomic_long_andnot() to
>>>> clear bits when the wait queue is empty to avoid possible race condition.
>>> we do have race condition needed to be fixed with this change.
>> Indeed, let me edit the changelog to reflect that. Also, I think, it
>> needs a Reported-by:.
> How's something liks so then?
>
> ---
> Subject: locking/rwsem: Make handoff bit handling more consistent
> From: Waiman Long <longman@...hat.com>
> Date: Mon, 15 Nov 2021 20:29:12 -0500
>
> From: Waiman Long <longman@...hat.com>
>
> There are some inconsistency in the way that the handoff bit is being
> handled in readers and writers that lead to a race condition.
>
> Firstly, when a queue head writer set the handoff bit, it will clear
> it when the writer is being killed or interrupted on its way out
> without acquiring the lock. That is not the case for a queue head
> reader. The handoff bit will simply be inherited by the next waiter.
>
> Secondly, in the out_nolock path of rwsem_down_read_slowpath(), both
> the waiter and handoff bits are cleared if the wait queue becomes
> empty.  For rwsem_down_write_slowpath(), however, the handoff bit is
> not checked and cleared if the wait queue is empty. This can
> potentially make the handoff bit set with empty wait queue.
>
> Worse, the situation in rwsem_down_write_slowpath() relies on wstate,
> a variable set outside of the critical section containing the ->count
> manipulation, this leads to race condition where RWSEM_FLAG_HANDOFF
> can be double subtracted, corrupting ->count.
>
> To make the handoff bit handling more consistent and robust, extract
> out handoff bit clearing code into the new rwsem_del_waiter() helper
> function. Also, completely eradicate wstate; always evaluate
> everything inside the same critical section.
>
> The common function will only use atomic_long_andnot() to clear bits
> when the wait queue is empty to avoid possible race condition.  If the
> first waiter with handoff bit set is killed or interrupted to exit the
> slowpath without acquiring the lock, the next waiter will inherit the
> handoff bit.
>
> While at it, simplify the trylock for loop in
> rwsem_down_write_slowpath() to make it easier to read.
>
> Fixes: 4f23dbc1e657 ("locking/rwsem: Implement lock handoff to prevent lock starvation")
> Reported-by: Zhenhua Ma <mazhenhua@...omi.com>
> Suggested-by: Peter Zijlstra <peterz@...radead.org>
> Signed-off-by: Waiman Long <longman@...hat.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
> Link: https://lkml.kernel.org/r/20211116012912.723980-1-longman@redhat.com
> ---
>
Yes, that looks good to me. Thanks for the editing.

Cheers,
Longman

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ