Message-ID: <aMqdaCkflusKi2hA@willie-the-truck>
Date: Wed, 17 Sep 2025 12:37:12 +0100
From: Will Deacon <will@...nel.org>
To: pengyu <pengyu@...inos.cn>
Cc: Peter Zijlstra <peterz@...radead.org>, mingo@...hat.com,
	boqun.feng@...il.com, longman@...hat.com,
	linux-kernel@...r.kernel.org, Mark Rutland <mark.rutland@....com>,
	t.haas@...bs.de, parri.andrea@...il.com, j.alglave@....ac.uk,
	luc.maranget@...ia.fr, paulmck@...nel.org,
	jonas.oberhauser@...weicloud.com, r.maseli@...bs.de,
	lkmm@...ts.linux.dev, stern@...land.harvard.edu
Subject: Re: [PATCH] locking/qspinlock: use xchg with _mb in slowpath for
 arm64

On Wed, Sep 17, 2025 at 06:51:18PM +0800, pengyu wrote:
> Yes, this issue occurred on a kunpeng920 96-core machine and only
> affected a small number of systems that had been running for over a
> year.
> 
> Vmcore Analysis:
> • Panic triggered by CPU 83 detecting a hard lockup at
>     queued_spin_lock_slowpath+0x1d8/0x320.
> 
> • Corresponding code:
>     arch_mcs_spin_lock_contended(&node->locked);
> 
> • The qspinlock involved was the rq lock, which showed a cleared state:
>     crash> rq.lock,cpu ffffad96ff2907c0
>       lock = {
>         raw_lock = {
>           {
>             val = {
>               counter = 0
>             },
>             {
>               locked = 0 '\000',
>               pending = 0 '\000'
>             },
>             {
>               locked_pending = 0,
>               tail = 0
>             }
>           }
>         }
>       },
>       cpu = 50,
> 
> • CPU 83’s MCS node remained in a locked=0 state, with no previous
> node found in the qnodes list.
>     crash> p qnodes:83
>     per_cpu(qnodes, 83) = $292 =
>      {{
>         mcs = {
>           next = 0x0,
>           locked = 0,
>           count = 1
>         }
>       },
>     crash> p qnodes | grep 83
>       [83]: ffffadd6bf7914c0
>     crash> p qnodes:all | grep ffffadd6bf7914c0
>     crash>
> 
> • Since rq->lock was cleared, no CPU could notify CPU 83.
> 
> This issue has occurred multiple times, but the root cause remains
> unclear. We suspect that CPU 83 may have failed to enqueue itself,
> potentially due to a failure in the xchg_tail atomic operation.

Hmm. For the lock word to be clear while a CPU is still spinning on its MCS
node, something has gone quite badly wrong. I think that would mean that:

  1. The spinning CPU has updated tail to point to its node (xchg_tail())
  2. The lock-owning CPU then erroneously cleared the tail field
     (atomic_try_cmpxchg_relaxed())

But for the cmpxchg() in (2) to succeed, the xchg() in (1) would have to be
ordered after it, in which case the lock word wouldn't end up as zero. This
is because RmW atomics must be totally ordered for a given memory location,
and that applies regardless of their memory-ordering properties.
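
For reference, the two operations look roughly like this (abridged from
kernel/locking/qspinlock.c, not a verbatim copy; the exact xchg_tail()
variant depends on config):

  /* (1) enqueue: publish our tail with a relaxed RmW on the lock word */
  static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail)
  {
          /*
           * Relaxed is sufficient because the MCS node is initialised
           * before the tail is published.
           */
          return (u32)xchg_relaxed(&lock->tail,
                                   tail >> _Q_TAIL_OFFSET) << _Q_TAIL_OFFSET;
  }

  /* (2) queue head: clear the tail and take the lock iff we're alone */
  if ((val & _Q_TAIL_MASK) == tail) {
          if (atomic_try_cmpxchg_relaxed(&lock->val, &val, _Q_LOCKED_VAL))
                  goto release;   /* no contention */
  }

Both are RmW atomics on the lock word, so either (2) observes the tail
written by (1) and its cmpxchg fails, or (1) is ordered after (2) and
re-installs the tail; neither leaves the word at zero with a CPU still
queued.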

Of course, there could be _a_ bug here but, given the information you've
been able to provide, it's not obviously as "simple" as a missing memory
barrier. Have you confirmed that adding memory barriers makes the problem
go away?

If you're able to check the thread_info (via sp_el0) of CPU 83 in your
example, it might be interesting to see whether or not the 'cpu' field
has been corrupted. For example, if it ends up being read as -1 then we
may compute a tail of 0 when enqueuing our MCS node into the lock word.
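
For context, the tail encoding is roughly (again paraphrased from
kernel/locking/qspinlock.c):

  static inline __pure u32 encode_tail(int cpu, int idx)
  {
          u32 tail;

          tail  = (cpu + 1) << _Q_TAIL_CPU_OFFSET;
          tail |= idx << _Q_TAIL_IDX_OFFSET;      /* assume idx < 4 */

          return tail;
  }

so a 'cpu' read back as -1 with idx == 0 encodes to a tail of zero, which
is what xchg_tail() would then publish and which would be consistent with
the cleared lock word in your vmcore.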

> It has been noted that the _relaxed version is used in xchg_tail, and we
> are uncertain whether this could lead to visibility issues, for example
> if CPU 83 modifies lock->tail but other CPUs fail to observe the
> change.
> 
> We are also checking if this is related:
>     https://lkml.kernel.org/r/cb83e3e4-9e22-4457-bf61-5614cc4396ad@tu-bs.de

Hopefully I (or somebody from Arm) can provide an update soon on this
topic but I wouldn't necessarily expect it to help you with this new
case.

Will
