Date:   Fri, 17 Jun 2022 14:57:55 -0400
From:   Waiman Long <longman@...hat.com>
To:     Eric Dumazet <edumazet@...gle.com>
Cc:     Shakeel Butt <shakeelb@...gle.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Eric Dumazet <eric.dumazet@...il.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...hat.com>,
        Boqun Feng <boqun.feng@...il.com>,
        Will Deacon <will@...nel.org>, Roman Penyaev <rpenyaev@...e.de>
Subject: Re: [PATCH] locking/rwlocks: do not starve writers

On 6/17/22 13:45, Eric Dumazet wrote:
> On Fri, Jun 17, 2022 at 7:42 PM Waiman Long <longman@...hat.com> wrote:
>> On 6/17/22 11:24, Eric Dumazet wrote:
>>> On Fri, Jun 17, 2022 at 5:00 PM Waiman Long <longman@...hat.com> wrote:
>>>> On 6/17/22 10:57, Shakeel Butt wrote:
>>>>> On Fri, Jun 17, 2022 at 7:43 AM Waiman Long <longman@...hat.com> wrote:
>>>>>> On 6/17/22 08:07, Peter Zijlstra wrote:
>>>>>>> On Fri, Jun 17, 2022 at 02:10:39AM -0700, Eric Dumazet wrote:
>>>>>>>> --- a/kernel/locking/qrwlock.c
>>>>>>>> +++ b/kernel/locking/qrwlock.c
>>>>>>>> @@ -23,16 +23,6 @@ void queued_read_lock_slowpath(struct qrwlock *lock)
>>>>>>>>         /*
>>>>>>>>          * Readers come here when they cannot get the lock without waiting
>>>>>>>>          */
>>>>>>>> -    if (unlikely(in_interrupt())) {
>>>>>>>> -            /*
>>>>>>>> -             * Readers in interrupt context will get the lock immediately
>>>>>>>> -             * if the writer is just waiting (not holding the lock yet),
>>>>>>>> -             * so spin with ACQUIRE semantics until the lock is available
>>>>>>>> -             * without waiting in the queue.
>>>>>>>> -             */
>>>>>>>> -            atomic_cond_read_acquire(&lock->cnts, !(VAL & _QW_LOCKED));
>>>>>>>> -            return;
>>>>>>>> -    }
>>>>>>>>         atomic_sub(_QR_BIAS, &lock->cnts);
>>>>>>>>
>>>>>>>>         trace_contention_begin(lock, LCB_F_SPIN | LCB_F_READ);
>>>>>>> This is known to break tasklist_lock.
>>>>>>>
>>>>>> We certainly can't break the current usage of tasklist_lock.
>>>>>>
>>>>>> I am aware of this problem with networking code and am thinking about
>>>>>> either relaxing the check to exclude softirq or providing a
>>>>>> read_lock_unfair() variant for networking use.
>>>>> read_lock_unfair() for networking use or tasklist_lock use?
>>>> I meant to say read_lock_fair(), but it could also be the other way
>>>> around. Thanks for spotting that.
>>>>
>>> If only tasklist_lock is problematic and needs the unfair variant,
>>> then changing a few read_lock() for tasklist_lock will be less
>>> invasive than ~1000 read_lock() elsewhere....
>> After a second thought, I think the right way is to introduce a fair
>> variant, if needed. If an arch isn't using qrwlock, the native rwlock
>> implementation will be unfair. In that sense, unfair rwlock is the
>> default. We will only need to change the relevant network read_lock()
>> calls to use the fair variant which will still be unfair if qrwlock
>> isn't used. We are not going to touch other read_lock() calls that don't
>> care about fair or unfair.
>>
> Hmm... backporting this kind of invasive change to stable kernels will
> be a daunting task.
>
> Were rwlocks always unfair, and we have been lucky ?
>
Yes, rwlocks were always unfair, and they have always had this kind of
soft lockup problem, as well as a scalability problem caused by
cacheline bouncing. That was the reason for creating qrwlock, which can
at least provide a fair rwlock in task context. Now that we have systems
with more and more CPUs, you are seeing it all over again with the
networking code.

Cheers,
Longman
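
To make the starvation argument in this thread concrete, here is a
minimal single-threaded user-space model, not kernel code. The bit
layout is meant to mirror the qrwlock lock word as I understand it from
include/asm-generic/qrwlock.h; the function names and the step-based
simulation are hypothetical and exist only for illustration. The
"unfair" predicate matches the in_interrupt() branch in the quoted diff
(wait only while _QW_LOCKED is set). The "fair" predicate is a
simplification: in the kernel, fairness comes from readers queueing
behind the waiting writer, which is approximated here by refusing entry
while any writer bit is set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror the qrwlock lock-word layout (illustrative only). */
#define QW_WAITING 0x100u            /* a writer is waiting            */
#define QW_LOCKED  0x0ffu            /* a writer holds the lock        */
#define QW_WMASK   0x1ffu            /* any writer bit                 */
#define QR_SHIFT   9                 /* reader count starts at bit 9   */
#define QR_BIAS    (1u << QR_SHIFT)  /* one reader                     */

/* Unfair policy: a reader is blocked only by a writer that holds the lock. */
static bool reader_can_enter_unfair(uint32_t cnts)
{
    return !(cnts & QW_LOCKED);
}

/* Fair policy (simplified): a reader also yields to a waiting writer. */
static bool reader_can_enter_fair(uint32_t cnts)
{
    return !(cnts & QW_WMASK);
}

/*
 * Hypothetical simulation: a writer is already waiting while `active`
 * readers hold the lock. In every step one reader unlocks and
 * `arrivals_per_step` new readers try to enter under the chosen policy.
 * Returns the step at which the reader count reaches zero (the writer
 * could then take the lock), or -1 if that does not happen within
 * `max_steps` steps.
 */
static int steps_until_writer_runs(bool fair, unsigned active,
                                   unsigned arrivals_per_step, int max_steps)
{
    uint32_t cnts = QW_WAITING + active * QR_BIAS;

    for (int step = 0; step < max_steps; step++) {
        if ((cnts >> QR_SHIFT) == 0)
            return step;

        cnts -= QR_BIAS;  /* one current reader drops the lock */

        for (unsigned i = 0; i < arrivals_per_step; i++) {
            bool ok = fair ? reader_can_enter_fair(cnts)
                           : reader_can_enter_unfair(cnts);
            if (ok)
                cnts += QR_BIAS;  /* the new reader gets in */
        }
    }
    return -1;
}
```

With three active readers and one new reader arriving per step,
steps_until_writer_runs(true, 3, 1, 100) returns 3, because the
existing readers drain and nobody new gets in. The unfair policy with
the same arguments returns -1: each departing reader is replaced, so
the waiting writer never sees a zero reader count.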
