linux-kernel - Re: [PATCH bpf-next v2 00/26] Resilient Queued Spin Lock

Message-ID: <20250210093840.GE10324@noisy.programming.kicks-ass.net>
Date: Mon, 10 Feb 2025 10:38:40 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Kumar Kartikeya Dwivedi <memxor@...il.com>
Cc: bpf@...r.kernel.org, linux-kernel@...r.kernel.org,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Will Deacon <will@...nel.org>, Waiman Long <llong@...hat.com>,
	Alexei Starovoitov <ast@...nel.org>,
	Andrii Nakryiko <andrii@...nel.org>,
	Daniel Borkmann <daniel@...earbox.net>,
	Martin KaFai Lau <martin.lau@...nel.org>,
	Eduard Zingerman <eddyz87@...il.com>,
	"Paul E. McKenney" <paulmck@...nel.org>, Tejun Heo <tj@...nel.org>,
	Barret Rhoden <brho@...gle.com>, Josh Don <joshdon@...gle.com>,
	Dohyun Kim <dohyunkim@...gle.com>,
	linux-arm-kernel@...ts.infradead.org, kernel-team@...a.com
Subject: Re: [PATCH bpf-next v2 00/26] Resilient Queued Spin Lock

On Thu, Feb 06, 2025 at 02:54:08AM -0800, Kumar Kartikeya Dwivedi wrote:


> Deadlock Detection
> ~~~~~~~~~~~~~~~~~~
> We handle two cases of deadlocks: AA deadlocks (attempts to acquire the
> same lock again), and ABBA deadlocks (attempts to acquire two locks in
> the opposite order from two distinct threads). Variants of ABBA
> deadlocks may be encountered with more than two locks being held in the
> incorrect order. These are not diagnosed explicitly, as they reduce to
> ABBA deadlocks.
> 
> Deadlock detection is triggered immediately when beginning the waiting
> loop of a lock slow path.
> 
> While timeouts ensure that any waiting loops in the locking slow path
> terminate and return to the caller, the wait can be excessively long in
> some situations. Although the default timeout is short (0.5s), a stall
> of this duration inside the kernel can set off alerts for
> latency-critical services with strict SLOs. Ideally, the kernel should
> recover from an undesired state of the lock as soon as possible.
> 
> A multi-step strategy is used to recover the kernel from waiting loops
> in the locking algorithm which may fail to terminate in a bounded amount
> of time.
> 
>  * Each CPU maintains a table of held locks. Entries are inserted and
>    removed upon entry into lock, and exit from unlock, respectively.
>  * Detecting AA deadlocks is thus simple: we have an AA
>    deadlock if we find a held lock entry for the lock we’re attempting
>    to acquire on the same CPU.
>  * During deadlock detection for ABBA, we search through the tables of
>    all other CPUs to find situations where we are holding a lock the
>    remote CPU is attempting to acquire, and they are holding a lock we
>    are attempting to acquire. Upon encountering such a condition, we
>    report an ABBA deadlock.
>  * We divide the time between entry into the waiting loop and the
>    timeout into intervals of 1 ms. Upon entry into the slow path, and
>    then on completion of each 1 ms interval, we perform detection of
>    both AA and ABBA deadlocks until the timeout expires. If deadlock
>    detection yields a positive result, recovery happens sooner than
>    the timeout. Otherwise, it happens as a last resort when the
>    timeout expires.
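
For readers following along, the table-based checks described in the list above can be sketched roughly as follows. This is a single-threaded userspace illustration, not code from the patch series: `struct cpu_locks`, `tables`, `table_push`, `is_aa_deadlock`, `is_abba_deadlock`, `NR_CPUS` and `MAX_HELD` are all hypothetical names, and keeping the lock being waited on in a separate `waiting` field is a simplifying assumption. A real implementation would also have to read remote CPUs' tables safely while they change, and would rerun the checks on each 1 ms interval, which this sketch omits.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS  4	/* hypothetical: number of simulated CPUs */
#define MAX_HELD 8	/* hypothetical: per-CPU table capacity */

/* Hypothetical per-CPU record: locks held, plus the lock being waited on. */
struct cpu_locks {
	const void *held[MAX_HELD];
	int cnt;
	const void *waiting;	/* NULL when the CPU is not in a slow path */
};

static struct cpu_locks tables[NR_CPUS];

/* Record that @cpu holds @lock; returns false if the table is full. */
static bool table_push(int cpu, const void *lock)
{
	struct cpu_locks *t = &tables[cpu];

	if (t->cnt >= MAX_HELD)
		return false;
	t->held[t->cnt++] = lock;
	return true;
}

/* Linear scan of one CPU's table for @lock. */
static bool table_holds(const struct cpu_locks *t, const void *lock)
{
	for (int i = 0; i < t->cnt; i++)
		if (t->held[i] == lock)
			return true;
	return false;
}

/* AA: the lock we want is already in our own table. */
static bool is_aa_deadlock(int cpu, const void *lock)
{
	return table_holds(&tables[cpu], lock);
}

/*
 * ABBA: some remote CPU is waiting on a lock we hold, while holding
 * the lock we are trying to acquire.
 */
static bool is_abba_deadlock(int cpu, const void *lock)
{
	for (int r = 0; r < NR_CPUS; r++) {
		const struct cpu_locks *rt = &tables[r];

		if (r == cpu || !rt->waiting)
			continue;
		if (table_holds(&tables[cpu], rt->waiting) &&
		    table_holds(rt, lock))
			return true;
	}
	return false;
}
```

In this sketch, a CPU holding A and attempting B reports ABBA only once some other CPU holds B and has recorded that it is waiting on A; both scans are linear in the table size.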
> 
> Timeouts
> ~~~~~~~~
> Timeouts act as the final line of defense against stalls in waiting
> loops. The ‘ktime_get_mono_fast_ns’ function is used to poll for the
> current time, which is compared to the end-time timestamp in the
> waiter loop. Each waiting loop is instrumented to check an extra
> condition using a macro. Internally, the macro amortizes the timeout
> check to avoid sampling the clock in every iteration: the check is
> performed once every 64k iterations.
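
The amortized check described above could look roughly like the sketch below. It is a userspace illustration under stated assumptions, not the macro from the series: `struct timeout_state`, `timed_out`, `CHECK_PERIOD` and `fake_clock` are hypothetical names, and the clock is passed in as a function pointer so that a fake clock can stand in for `ktime_get_mono_fast_ns`.

```c
#include <stdbool.h>
#include <stdint.h>

/* 64k iterations between clock samples, per the description above. */
#define CHECK_PERIOD (1u << 16)

/* Hypothetical per-waiter state. */
struct timeout_state {
	uint64_t deadline_ns;		/* absolute end time */
	uint32_t spins;			/* iterations since last clock sample */
	uint64_t (*now_ns)(void);	/* stand-in for the kernel clock */
	unsigned long clock_reads;	/* for illustration only */
};

/*
 * Called once per loop iteration. Samples the clock only on every
 * CHECK_PERIOD-th call, so most iterations cost one increment and
 * one compare.
 */
static bool timed_out(struct timeout_state *ts)
{
	if (++ts->spins < CHECK_PERIOD)
		return false;
	ts->spins = 0;
	ts->clock_reads++;
	return ts->now_ns() >= ts->deadline_ns;
}

/* Fake clock for demonstration: advances 1000 ns on every read. */
static uint64_t fake_now_ns;

static uint64_t fake_clock(void)
{
	fake_now_ns += 1000;
	return fake_now_ns;
}
```

The trade-off in this sketch is that a timeout can be noticed up to one full period of iterations late, in exchange for keeping the clock read off the hot path.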
> 
> Recovery
> ~~~~~~~~

I'm probably bad at reading, but I failed to find anything that
explained how you recover from a deadlock.

Do you force unload the BPF program?