Message-ID: <518185c1-c03a-7e8c-9d09-f67e42c9bc82@redhat.com>
Date:   Fri, 24 Jan 2020 19:38:50 -0500
From:   Waiman Long <longman@...hat.com>
To:     Alex Kogan <alex.kogan@...cle.com>
Cc:     Peter Zijlstra <peterz@...radead.org>, linux@...linux.org.uk,
        Ingo Molnar <mingo@...hat.com>,
        Will Deacon <will.deacon@....com>,
        Arnd Bergmann <arnd@...db.de>, linux-arch@...r.kernel.org,
        linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>,
        linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
        Borislav Petkov <bp@...en8.de>, hpa@...or.com, x86@...nel.org,
        Hanjun Guo <guohanjun@...wei.com>,
        Jan Glauber <jglauber@...vell.com>,
        Steven Sistare <steven.sistare@...cle.com>,
        Daniel Jordan <daniel.m.jordan@...cle.com>,
        dave.dice@...cle.com
Subject: Re: [PATCH v8 4/5] locking/qspinlock: Introduce starvation avoidance
 into CNA

On 1/24/20 4:27 PM, Alex Kogan wrote:
>
>> On Jan 24, 2020, at 4:12 PM, Waiman Long <longman@...hat.com> wrote:
>>
>> On 1/24/20 3:09 PM, Alex Kogan wrote:
>>>>> We also probably do not want those “prioritized” threads to disrupt
>>>>> normal CNA operation. E.g., if the main queue looks like T1_1, P2_1,
>>>>> T1_2, …, where T1_x is a thread running on node 1 and P2_1 is a
>>>>> prioritized thread running on node 2, we want to pass the lock from
>>>>> T1_1 to P2_1 and then to T1_2 (rather than having P2_1 scan for
>>>>> another thread on node 2).
>>>>>
>>>>> There is a way to achieve that: when we pass the lock to P2_1, we can
>>>>> set its numa node field to 1. This means that we need to reset the
>>>>> numa node field in cna_init_node(), but AFAICT this is relatively
>>>>> cheap. The rest of the CNA logic should not change.
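
A minimal sketch of how I read that proposal; the struct layout, the
numa_node field and the helper name below are assumptions for
illustration, not code lifted from the patch:

struct cna_node {
	struct mcs_spinlock	mcs;
	int			numa_node;	/* node of the CPU this waiter runs on */
	/* ... */
};

static void cna_init_node(struct mcs_spinlock *node)
{
	struct cna_node *cn = (struct cna_node *)node;

	/*
	 * Re-derive the node id on every lock attempt, since a prior
	 * handoff may have overwritten it (see below).
	 */
	cn->numa_node = numa_node_id();
}

/* Handing the lock to a prioritized waiter from another node: */
static void cna_pass_to_prioritized(struct cna_node *pri, int pref_node)
{
	/*
	 * Pretend the prioritized waiter belongs to the preferred node,
	 * so the scan performed at its unlock keeps favoring pref_node
	 * instead of the waiter's own node.
	 */
	pri->numa_node = pref_node;
	smp_store_release(&pri->mcs.locked, 1);
}
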
>>>> I wouldn't recommend doing that. If the lock cacheline has been moved
>>>> from node 1 to node 2, I would say it is better to stick with node 2
>>>> rather than switching back to node 1. That means the secondary queue
>>>> may contain lock waiters from the same node, but they will eventually
>>>> be flushed back to the primary queue.
>>>>
>>> That’s right, assuming we do not reset the intra_node count when
>>> transferring the lock to a prioritized thread from another node.
>>> Otherwise, we may starve waiters in the secondary queue.
>>>
>>> Still, that can make the lock even less fair to non-prioritized
>>> threads. When you flush the secondary queue, the preference may remain
>>> with the same node. This will not happen in the current form of CNA,
>>> as we never get threads from the preferred node in the secondary queue.
>> That is true.
>>
>> However, it is no different from the current scheme, in which a waiter
>> from another node may have to wait for 64k other waiters to go first
>> before it has a chance to get the lock. Now that waiter can be from the
>> same node as well.
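
For reference, the 64k figure comes from a fairness cap roughly along the
lines below; the constant name and the intra_count accounting are written
from memory and may differ from what the patch actually uses:

#define CNA_INTRA_NODE_THRESHOLD	(64 * 1024)	/* assumed name */

static bool cna_prefer_local_node(struct cna_node *cn)
{
	/*
	 * Keep handing the lock to same-node waiters until roughly 64k
	 * consecutive local handoffs have occurred; after that the
	 * secondary queue is flushed back into the primary queue so
	 * remote waiters finally get their turn.
	 */
	return cn->intra_count < CNA_INTRA_NODE_THRESHOLD;
}
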
> The difference is that in the current form of CNA, the preferred node
> _will_ change after 64k lock transitions. With the change you propose,
> this is no longer the case; it may take another ~64k transitions for
> that to happen. More generally, I think this makes the analysis of the
> lock behavior more convoluted.
>
> I think we should treat those prioritized threads as “wild” cards,
> passing the lock through them but keeping the preferred node intact.
> This will potentially cost one extra lock migration, but it will make
> reasoning about the lock behavior easier.
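
If I understand the “wild card” handoff correctly, it would look more
like the sketch below; the helper name and the extra pref_node field are
again my own placeholders, not something in the current patch:

static void cna_pass_lock_wildcard(struct cna_node *pri, int pref_node)
{
	/*
	 * Hand the lock straight to the prioritized waiter without
	 * touching its numa_node or the intra_node count.  The preferred
	 * node has to be remembered somewhere (here, an added pref_node
	 * field) so that the next handoff resumes scanning for pref_node
	 * waiters, at the cost of one extra cross-node lock migration.
	 */
	pri->pref_node = pref_node;
	smp_store_release(&pri->mcs.locked, 1);
}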

It seems like you prefer mathematical purity over practical
considerations. I won't object to that if you insist it is the right way
to go. Just be mindful that you may need to add a preferred-node value to
do that. It will also complicate the code, but it is your choice.

Cheers,
Longman
