linux-kernel - Re: Unfair qspinlocks on ARM64 without LSE atomics => 3ms delay in interrupt handling

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <AS1PR10MB5675F191C6EC49ED8BEA9BECEB8B9@AS1PR10MB5675.EURPRD10.PROD.OUTLOOK.COM>
Date:   Mon, 27 Mar 2023 05:44:30 +0000
From:   "Bouska, Zdenek" <zdenek.bouska@...mens.com>
To:     Catalin Marinas <catalin.marinas@....com>
CC:     Thomas Gleixner <tglx@...utronix.de>,
        Will Deacon <will@...nel.org>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "Kiszka, Jan" <jan.kiszka@...mens.com>,
        "linux-rt-users@...r.kernel.org" <linux-rt-users@...r.kernel.org>,
        Nishanth Menon <nm@...com>, Puranjay Mohan <p-mohan@...com>
Subject: Re: Unfair qspinlocks on ARM64 without LSE atomics => 3ms delay in
 interrupt handling

>> So I confirmed that atomic operations from
>> arch/arm64/include/asm/atomic_ll_sc.h can be quite slow when they are
>> contested from second CPU.
>> 
>> Do you think that it is possible to create fair qspinlock implementation
>> on top of atomic instructions supported by ARM64 version 8 (no LSE atomic
>> instructions) without compromising performance in the uncontested case?
>> For example ARM64 could have custom queued_fetch_set_pending_acquire
>> implementation same as x86 has in arch/x86/include/asm/qspinlock.h. Is the
>> retry loop in irq_finalize_oneshot() ok together with the current ARM64
>> cpu_relax() implementation for processor with no LSE atomic instructions?
>
>So is the queued_fetch_set_pending_acquire() where it gets stuck or the
>earlier atomic_try_cmpxchg_acquire() before entering on the slow path? I
>guess both can fail in a similar way.

For me it was stuck on queued_fetch_set_pending_acquire().

Zdenek Bouska

--
Siemens, s.r.o
Siemens Advanta Development