Message-ID: <CAP01T74hRYCkrqz4JKqXH7ya0ykBfX4_6611q-TO52o1TZsfjg@mail.gmail.com>
Date: Thu, 13 Feb 2025 07:20:46 +0100
From: Kumar Kartikeya Dwivedi <memxor@...il.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: bpf@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Barret Rhoden <brho@...gle.com>, Linus Torvalds <torvalds@...ux-foundation.org>, 
	Will Deacon <will@...nel.org>, Waiman Long <llong@...hat.com>, Alexei Starovoitov <ast@...nel.org>, 
	Andrii Nakryiko <andrii@...nel.org>, Daniel Borkmann <daniel@...earbox.net>, 
	Martin KaFai Lau <martin.lau@...nel.org>, Eduard Zingerman <eddyz87@...il.com>, 
	"Paul E. McKenney" <paulmck@...nel.org>, Tejun Heo <tj@...nel.org>, Josh Don <joshdon@...gle.com>, 
	Dohyun Kim <dohyunkim@...gle.com>, linux-arm-kernel@...ts.infradead.org, 
	kernel-team@...a.com
Subject: Re: [PATCH bpf-next v2 09/26] rqspinlock: Protect waiters in queue
 from stalls

On Mon, 10 Feb 2025 at 11:17, Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Thu, Feb 06, 2025 at 02:54:17AM -0800, Kumar Kartikeya Dwivedi wrote:
> > Implement the wait queue cleanup algorithm for rqspinlock. There are
> > three forms of waiters in the original queued spin lock algorithm. The
> > first is the waiter which acquires the pending bit and spins on the lock
> > word without forming a wait queue. The second is the head waiter that is
> > the first waiter heading the wait queue. The third form is of all the
> > non-head waiters queued behind the head, waiting to be signalled through
> > their MCS node to overtake the responsibility of the head.
> >
> > In this commit, we are concerned with the second and third kind. First,
> > we augment the waiting loop of the head of the wait queue with a
> > timeout. When this timeout happens, all waiters part of the wait queue
> > will abort their lock acquisition attempts.
>
> Why? Why terminate the whole wait-queue?
>
> I *think* I understand, but it would be good to spell out. Also, in the
> comment.

Ack. The main reason is that we eschew per-waiter timeouts in favor of
a single timeout applied at the head of the wait queue.
This allows everyone to break out faster once the head has seen the
owner / pending waiter not responding for the timeout duration.
Secondly, it avoids complicated synchronization: when waiters do not
leave in FIFO order, prev's next pointer needs to be fixed up, etc.
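
To illustrate the second point, here is a minimal single-threaded sketch
of the two options. It is a toy model only: struct waiter and both
helpers are hypothetical names made up for this mail, not the rqspinlock
code, and it ignores all the atomics and races a real queue has to
handle. The first helper models the head timing out and the abort being
passed down the queue in FIFO order; the second models what a per-waiter
timeout would additionally require, namely repairing the predecessor's
next pointer when a non-head waiter leaves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of an MCS-style queue node (not the kernel's). */
struct waiter {
	struct waiter *next;
	bool aborted;	/* set once this waiter has given up */
};

/*
 * Head-only timeout: the head gives up and the abort is handed down the
 * queue in FIFO order. Each waiter only ever looks at its own successor,
 * so no pointer fixups are needed. Returns the number of waiters aborted.
 */
static int abort_queue_from_head(struct waiter *head)
{
	int n = 0;

	assert(head != NULL);
	for (struct waiter *w = head; w; w = w->next) {
		w->aborted = true;
		n++;
	}
	return n;
}

/*
 * Per-waiter timeout (the option being avoided): a waiter leaving from
 * the middle must find its predecessor and repair prev->next. In a real
 * concurrent queue this would race with both neighbours.
 */
static bool unlink_middle(struct waiter *head, struct waiter *victim)
{
	assert(head != NULL && victim != NULL);
	for (struct waiter *prev = head; prev; prev = prev->next) {
		if (prev->next == victim) {
			prev->next = victim->next;	/* the fixup */
			victim->next = NULL;
			victim->aborted = true;
			return true;
		}
	}
	return false;	/* victim was the head or not queued */
}
```

Even in this simplified form, the second helper has to walk the queue
and rewrite another waiter's node, which is the part that would need
extra synchronization in the real lock.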

Let me know if this explanation differs from your understanding.
