Message-ID: <359cc93a-fce0-5af2-0fd5-81999fad186b@redhat.com>
Date:   Mon, 7 Nov 2022 11:49:01 -0500
From:   Waiman Long <longman@...hat.com>
To:     Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Jan Kara <jack@...e.cz>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Steven Rostedt <rostedt@...dmis.org>,
        Mel Gorman <mgorman@...e.de>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>,
        Catalin Marinas <catalin.marinas@....com>
Subject: Re: Crash with PREEMPT_RT on aarch64 machine

On 11/7/22 10:10, Sebastian Andrzej Siewior wrote:
> + locking, arm64
>
> On 2022-11-07 14:56:36 [+0100], Jan Kara wrote:
>>> spinlock_t and raw_spinlock_t differ slightly in terms of locking.
>>> rt_spin_lock() has a fast path via try_cmpxchg_acquire(). If you
>>> enable CONFIG_DEBUG_RT_MUTEXES then you force the slow path, which
>>> always acquires rt_mutex_base::wait_lock (a raw_spinlock_t) while
>>> the actual lock is modified via cmpxchg.
>> So I've tried enabling CONFIG_DEBUG_RT_MUTEXES and indeed the corruption
>> stops happening as well. So do you suspect some bug in the CPU itself?
> If it is only enabling CONFIG_DEBUG_RT_MUTEXES (and not the whole of
> lockdep) then it looks very suspicious.
> CONFIG_DEBUG_RT_MUTEXES enables a few additional checks, but the main
> part is that rt_mutex_cmpxchg_acquire() + rt_mutex_cmpxchg_release()
> always fail (so the slowpath under a raw_spinlock_t is always taken).
>
> So if it is really the fast path (rt_mutex_cmpxchg_acquire()) then it
> somehow smells like the CPU is misbehaving.
>
> Could someone from the locking/arm64 department check if the locking in
> RT-mutex (rtlock_lock()) is correct?
>
> rtmutex locking uses try_cmpxchg_acquire() for the fastpath (and
> try_cmpxchg_release() for unlock).
> Now looking at it again, I don't see much difference compared to what
> queued_spin_trylock() does, except that the latter always operates on a
> 32-bit value instead of a pointer.
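
For reference, the two fast paths look roughly like this (simplified
sketches, not the exact upstream code; with CONFIG_DEBUG_RT_MUTEXES the
rtmutex cmpxchg helpers are stubbed to always fail, so only the slowpath
runs):

/*
 * rtlock fast path, roughly (see rtlock_lock() in
 * kernel/locking/rtmutex.c): swing rtm->owner from NULL to current
 * with a pointer-sized (64-bit on arm64) cmpxchg.
 */
static __always_inline void rtlock_lock_sketch(struct rt_mutex_base *rtm)
{
	struct task_struct *old = NULL;

	if (unlikely(!try_cmpxchg_acquire(&rtm->owner, &old, current)))
		rtlock_slowlock(rtm);	/* slowpath under rtm->wait_lock */
}

/*
 * qspinlock fast path, roughly (see queued_spin_trylock()): swing
 * lock->val from 0 to _Q_LOCKED_VAL with a 32-bit cmpxchg.
 */
static __always_inline int queued_spin_trylock_sketch(struct qspinlock *lock)
{
	int val = atomic_read(&lock->val);

	if (unlikely(val))
		return 0;

	return likely(atomic_try_cmpxchg_acquire(&lock->val, &val,
						 _Q_LOCKED_VAL));
}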

The fast paths of both queued spinlock and rt_spin_lock use
try_cmpxchg_acquire(); the only difference I see is the size of the data
being cmpxchg'ed. qspinlock operates on a 32-bit integer whereas
rt_spin_lock operates on a 64-bit pointer. So I suspect it comes down to
how arm64 implements cmpxchg. There are two different implementations,
depending on whether LSE atomics are available on the platform. So
exactly which arm64 system is being used here, and what hardware
capabilities does it have?
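
For the record, my understanding of the two flavours for the 64-bit
acquire case is roughly the following (illustrative sketch only, not the
exact arch/arm64 code, which patches in one variant at boot):

/*
 * LL/SC (no FEAT_LSE): load/store-exclusive retry loop.
 *
 * 1:	ldaxr	x0, [ptr]		// load-acquire exclusive
 *	cmp	x0, x_old
 *	b.ne	2f			// bail out if the compare fails
 *	stxr	w1, x_new, [ptr]	// store exclusive
 *	cbnz	w1, 1b			// retry if exclusivity was lost
 * 2:
 *
 * LSE (FEAT_LSE, ARMv8.1+): a single instruction.
 *
 *	mov	x0, x_old
 *	casa	x0, x_new, [ptr]	// compare-and-swap with acquire
 */

If the corruption only reproduces with one of the two, that would point
at either the exclusive-monitor loop or the CAS path.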

Cheers,
Longman
