lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2e9e8224-18c3-697e-aa8a-c485986d0cea@redhat.com>
Date:   Mon, 24 Oct 2022 09:50:56 -0400
From:   Waiman Long <longman@...hat.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>,
        Boqun Feng <boqun.feng@...il.com>,
        linux-kernel@...r.kernel.org, john.p.donnelly@...cle.com,
        Hillf Danton <hdanton@...a.com>,
        Mukesh Ojha <quic_mojha@...cinc.com>,
        Ting11 Wang 王婷 <wangting11@...omi.com>
Subject: Re: [PATCH v3 1/5] locking/rwsem: Prevent non-first waiter from
 spinning in down_write() slowpath

On 10/24/22 09:17, Peter Zijlstra wrote:
> On Mon, Oct 17, 2022 at 05:13:52PM -0400, Waiman Long wrote:
>> A non-first waiter can potentially spin in the for loop of
>> rwsem_down_write_slowpath() without sleeping but fail to acquire the
>> lock even if the rwsem is free if the following sequence happens:
>>
>>    Non-first waiter       First waiter      Lock holder
>>    ----------------       ------------      -----------
>>    Acquire wait_lock
>>    rwsem_try_write_lock():
>>      Set handoff bit if RT or
>>        wait too long
>>      Set waiter->handoff_set
>>    Release wait_lock
>>                           Acquire wait_lock
>>                           Inherit waiter->handoff_set
>>                           Release wait_lock
>> 					   Clear owner
>>                                             Release lock
>>    if (waiter.handoff_set) {
>>      rwsem_spin_on_owner(();
>>      if (OWNER_NULL)
>>        goto trylock_again;
>>    }
>>    trylock_again:
>>    Acquire wait_lock
>>    rwsem_try_write_lock():
>>       if (first->handoff_set && (waiter != first))
>>       	return false;
>>    Release wait_lock
>>
>> It is especially problematic if the non-first waiter is an RT task and
>> it is running on the same CPU as the first waiter as this can lead to
>> live lock.
> I'm struggling to connect the Changelog to the actual patch. I see the
> problem, but I don't see how the below helps or is even related to the
> described problem.

Sorry if the description isn't clear, I will rephrase it to make it 
clearer. The basic idea is that a non-first waiter can mistakenly 
believe that it can spin on the lock. However, when 
rwsem_try_write_lock() is called, it can never acquire the lock and move 
on because it is not the first waiter:

                        if (first->handoff_set && (waiter != first))
                                 return false;

If that waiter happen to be an RT task, it can block the real first 
waiter to acquire the lock if it happen to run the same CPU.

Cheers,
Longman

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ