linux-kernel - Re: [PATCH v3 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock

Message-ID: <2a7a3ea8-7a94-52d4-b8ef-581de28e0063@redhat.com>
Date:   Tue, 16 Jul 2019 10:50:09 -0400
From:   Waiman Long <longman@...hat.com>
To:     Alex Kogan <alex.kogan@...cle.com>
Cc:     linux@...linux.org.uk, peterz@...radead.org, mingo@...hat.com,
        will.deacon@....com, arnd@...db.de, linux-arch@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
        tglx@...utronix.de, bp@...en8.de, hpa@...or.com, x86@...nel.org,
        guohanjun@...wei.com, jglauber@...vell.com,
        steven.sistare@...cle.com, daniel.m.jordan@...cle.com,
        dave.dice@...cle.com, rahul.x.yadav@...cle.com
Subject: Re: [PATCH v3 3/5] locking/qspinlock: Introduce CNA into the slow
 path of qspinlock

On 7/16/19 10:29 AM, Alex Kogan wrote:
>
>> On Jul 15, 2019, at 7:22 PM, Waiman Long <longman@...hat.com> wrote:
>>
>> On 7/15/19 5:30 PM, Waiman Long wrote:
>>>> -#ifndef _GEN_PV_LOCK_SLOWPATH
>>>> +#if !defined(_GEN_PV_LOCK_SLOWPATH) && !defined(_GEN_CNA_LOCK_SLOWPATH)
>>>>  
>>>>  #include <linux/smp.h>
>>>>  #include <linux/bug.h>
>>>> @@ -77,18 +77,14 @@
>>>>  #define MAX_NODES	4
>>>>  
>>>>  /*
>>>> - * On 64-bit architectures, the mcs_spinlock structure will be 16 bytes in
>>>> - * size and four of them will fit nicely in one 64-byte cacheline. For
>>>> - * pvqspinlock, however, we need more space for extra data. To accommodate
>>>> - * that, we insert two more long words to pad it up to 32 bytes. IOW, only
>>>> - * two of them can fit in a cacheline in this case. That is OK as it is rare
>>>> - * to have more than 2 levels of slowpath nesting in actual use. We don't
>>>> - * want to penalize pvqspinlocks to optimize for a rare case in native
>>>> - * qspinlocks.
>>>> + * On 64-bit architectures, the mcs_spinlock structure will be 20 bytes in
>>>> + * size. For pvqspinlock or the NUMA-aware variant, however, we need more
>>>> + * space for extra data. To accommodate that, we insert two more long words
>>>> + * to pad it up to 36 bytes.
>>>>   */
>>> The 20 bytes figure is wrong. It is actually 24 bytes for 64-bit as the
>>> mcs_spinlock structure is 8-byte aligned. For better cacheline
>>> alignment, I would like to keep mcs_spinlock at 16 bytes as before.
>>> Instead, you can use encode_tail() to store the CNA node pointer in
>>> "locked". For instance, use (encode_tail() << 1) in locked to
>>> distinguish it from the regular locked=1 value.
>>
>> Actually, the encoded tail value is already shifted left by either 16
>> or 9 bits, so there is no need to shift it. You can assign it directly:
>>
>> mcs->locked = cna->encoded_tail;
>>
>> You do need to change the type of locked to "unsigned int", though,
>> for proper comparison with "1".
>>
> Got it, thanks.
>
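To make the two points quoted above concrete, here is a rough user-space
sketch, not kernel code. It assumes the tail layout used when NR_CPUS is
below 16K (tail index at bit 16, tail CPU at bit 18). The struct names
and the helpers locked_is_plain() and locked_holds_tail() are made up
for illustration, and treating any value above 1 as an encoded tail is
my assumption about how the two cases would be told apart.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for NR_CPUS < 16K; the other layout uses offset 9. */
#define Q_TAIL_IDX_OFFSET 16
#define Q_TAIL_CPU_OFFSET (Q_TAIL_IDX_OFFSET + 2)

/* Stand-in for struct mcs_spinlock with "locked" made unsigned. */
struct mcs_node {
	struct mcs_node *next;
	unsigned int locked;	/* 0: waiting, 1: lock handed over, >1: encoded tail */
	int count;
};

/*
 * Hypothetical variant with one extra 4-byte field: 20 bytes of members,
 * padded to 24 on LP64 because the pointer forces 8-byte alignment.
 */
struct mcs_node_plus_u32 {
	struct mcs_node_plus_u32 *next;
	int locked;
	int count;
	uint32_t extra;
};

/* Same shape as the kernel's encode_tail(): CPU number is stored as cpu + 1. */
static uint32_t encode_tail(int cpu, int idx)
{
	assert(cpu >= 0 && idx >= 0 && idx < 4);
	return ((uint32_t)(cpu + 1) << Q_TAIL_CPU_OFFSET) |
	       ((uint32_t)idx << Q_TAIL_IDX_OFFSET);
}

/* Regular hand-over value. */
static int locked_is_plain(const struct mcs_node *node)
{
	return node->locked == 1;
}

/* Anything above 1 is treated as an encoded tail in this sketch. */
static int locked_holds_tail(const struct mcs_node *node)
{
	return node->locked > 1;
}
```

Because the CPU field is stored as cpu + 1 and sits above bit 9 in both
layouts, an encoded tail can never be 0 or 1, so it does not collide with
the existing values of locked.
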
I forgot to mention that I would like to see a boot command line option
to force the numa qspinlock code off, and maybe on as well. This can help
in testing, as you don't need to build 2 separate kernels, one with
NUMA_AWARE_SPINLOCKS on and one with it off. For small 2-socket systems,
numa qspinlock may not help much, so an option to turn it off can be
useful. Xen also has an option to turn off PV qspinlock.
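
As a rough illustration of what such a switch could look like, here is a
user-space sketch of a tri-state option parser. The parameter name
"numa_spinlock", the on/off/auto values and the "more than one node"
default are all placeholders of mine, not something the patch defines.
In the kernel this would hang off the usual early parameter handling
rather than scanning the command line by hand.

```c
#include <assert.h>
#include <string.h>

enum numa_lock_mode { NUMA_LOCK_AUTO, NUMA_LOCK_ON, NUMA_LOCK_OFF };

/*
 * Scan a space-separated command line for the hypothetical
 * "numa_spinlock=" option. The last valid occurrence wins and
 * unrecognized values are ignored.
 */
static enum numa_lock_mode parse_numa_spinlock(const char *cmdline)
{
	static const char key[] = "numa_spinlock=";
	const size_t key_len = sizeof(key) - 1;
	enum numa_lock_mode mode = NUMA_LOCK_AUTO;
	const char *p = cmdline;

	assert(cmdline != NULL);
	while ((p = strstr(p, key)) != NULL) {
		/* Only accept the key at the start of a word. */
		if (p == cmdline || p[-1] == ' ') {
			const char *val = p + key_len;
			size_t len = strcspn(val, " ");

			if (len == 2 && strncmp(val, "on", 2) == 0)
				mode = NUMA_LOCK_ON;
			else if (len == 3 && strncmp(val, "off", 3) == 0)
				mode = NUMA_LOCK_OFF;
			else if (len == 4 && strncmp(val, "auto", 4) == 0)
				mode = NUMA_LOCK_AUTO;
		}
		p += key_len;
	}
	return mode;
}

/* Placeholder policy: in auto mode, enable only with more than one node. */
static int use_numa_lock(enum numa_lock_mode mode, int nr_numa_nodes)
{
	if (mode == NUMA_LOCK_ON)
		return 1;
	if (mode == NUMA_LOCK_OFF)
		return 0;
	return nr_numa_nodes > 1;
}
```

With something like this, the same kernel image can be booted with the
option set to off and to on to compare the two slow paths.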

-Longman