lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <980d882c-01b8-2ce1-663f-41a8a337f350@redhat.com>
Date:   Mon, 24 Oct 2022 11:55:53 -0400
From:   Waiman Long <longman@...hat.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>,
        Boqun Feng <boqun.feng@...il.com>,
        linux-kernel@...r.kernel.org, john.p.donnelly@...cle.com,
        Hillf Danton <hdanton@...a.com>,
        Mukesh Ojha <quic_mojha@...cinc.com>,
        Ting11 Wang 王婷 <wangting11@...omi.com>
Subject: Re: [PATCH v3 2/5] locking/rwsem: Limit # of null owner retries for
 handoff writer

On 10/24/22 09:31, Peter Zijlstra wrote:
> On Mon, Oct 17, 2022 at 05:13:53PM -0400, Waiman Long wrote:
>> Commit 91d2a812dfb9 ("locking/rwsem: Make handoff writer optimistically
>> spin on owner") assumes that when the owner field is changed to NULL,
>> the lock will become free soon.  That assumption may not be correct
>> especially if the handoff writer doing the spinning is a RT task which
>> may preempt another task from completing its action of either freeing
>> the rwsem or properly setting up owner.
> I'm confused again -- rwsem_*_owner() has
> lockdep_assert_preemption_disabled(). But more specifically; why can the
> RT task preempt a lock-op like that?

There is a special case raised by Mukesh that can happen. I quoted his 
text here:

---------------------------

Looks like, There is still a window for a race.

There is a chance when a reader who came first added it's BIAS and goes 
to slowpath and before it gets added to wait list it got preempted by RT 
task which  goes to slowpath as well and being the first waiter gets its 
hand-off bit set and not able to get the lock due to following condition 
in rwsem_try_write_lock()

  630                 if (count & RWSEM_LOCK_MASK) {  ==> reader has 
sets its bias
..
...

  634
  635                         new |= RWSEM_FLAG_HANDOFF;
  636                 } else {
  637                         new |= RWSEM_WRITER_LOCKED;


---------------------->----------------------->-------------------------

First reader (1)          writer(2) RT task             Lock holder(3)

It sets
RWSEM_READER_BIAS.
while it is going to
slowpath(as the lock
was held by (3)) and
before it got added
to the waiters list
it got preempted
by (2).
                         RT task also takes
                         the slowpath and add              release the
                         itself into waiting list          rwsem lock
             and since it is the first         clear the
                         it is the next one to get         owner.
                         the lock but it can not
                         get the lock as (count &
                         RWSEM_LOCK_MASK) is set
                         as (1) has added it but
                         not able to remove its
             adjustment.

----------------------

To fix that we either has to disable preemption in down_read() and 
reenable it in rwsem_down_read_slowpath after decrementing the 
RWSEM_READER_BIAS or to limit the number of trylock-spinning attempt 
like this patch. The latter approach seems a bit less messy and I am 
going to take it back out anyway in patch 4. I will put a summary of 
that special case in the patch description.

Cheers,
Longman

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ