Message-ID: <20260109165628.Lt2MGP7M@linutronix.de>
Date: Fri, 9 Jan 2026 17:56:28 +0100
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: Thomas Gleixner <tglx@...utronix.de>, Florian Albertz <linux@...m.net>
Cc: mingo@...hat.com, linux-kernel@...r.kernel.org,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: PROBLEM: Kernel 6.17 newly deadlocks futex
On 2025-12-19 21:07:13 [+0100], Thomas Gleixner wrote:
> On Fri, Dec 19 2025 at 11:02, Florian Albertz wrote:
…
> > clone(child, malloc(STACK_SIZE) + STACK_SIZE, CLONE_VM, NULL, NULL, NULL);
> >
> > // And now this futex wait never wakes from kernel 6.17 onwards.
> > syscall(SYS_futex, fut, FUTEX_WAIT_PRIVATE, 0, NULL, NULL, 0);
> > }
>
> The below should fix that. It's not completely correct because the
> resulting hash sizing looks at current->signal->threads. As the signal
> struct is not shared, each resulting process accounts for its own
> threads. Fixing that needs some more thought.
I'm not sure whether I'm mixing things up, or whether this was based on
an earlier version where things were different, but if I remember
correctly PeterZ said that if someone uses CLONE_VM without CLONE_THREAD
then he gets to keep the pieces.
Using only CLONE_VM is okay (well, it is not, but it is not what causes
the problem here). Using CLONE_VM for some clone() invocations and
CLONE_VM + CLONE_THREAD for others is what causes the problem.
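To make that mix concrete, here is a minimal sketch (not the reporter's
actual test case; the waker side is not visible in the quoted fragment,
so the child issuing the wake and STACK_SIZE are assumptions):

/*
 * Illustrative only: one pthread (CLONE_THREAD | CLONE_VM under the
 * hood) mixed with one raw clone(CLONE_VM) child. The child waking the
 * futex is an assumption; the quoted fragment does not show the waker.
 * Build with: gcc -pthread -o mix mix.c
 */
#define _GNU_SOURCE
#include <linux/futex.h>
#include <pthread.h>
#include <sched.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

#define STACK_SIZE	(64 * 1024)

static uint32_t fut;

static void *thread_fn(void *arg)
{
	return NULL;		/* plain pthread: CLONE_THREAD | CLONE_VM */
}

static int child_fn(void *arg)
{
	sleep(1);
	/* Same mm, so a private wake should reach the parent. */
	syscall(SYS_futex, &fut, FUTEX_WAKE_PRIVATE, 1, NULL, NULL, 0);
	return 0;
}

int main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, thread_fn, NULL);
	pthread_join(t, NULL);

	clone(child_fn, (char *)malloc(STACK_SIZE) + STACK_SIZE,
	      CLONE_VM, NULL);

	/* Reported to never return from 6.17 onwards. */
	syscall(SYS_futex, &fut, FUTEX_WAIT_PRIVATE, 0, NULL, NULL, 0);
	return 0;
}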
Who is doing this? Some exotic early container runtime?
CLONE_VM without CLONE_THREAD is common in combination with CLONE_VFORK,
and in that case we don't want to create the private hash.
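For reference, that pattern is roughly what a posix_spawn()-style helper
does before exec (sketch only; names and stack size are made up):

/*
 * Common CLONE_VM | CLONE_VFORK pattern: the child borrows the
 * parent's mm only until it calls exec, so it never needs a private
 * futex hash of its own.
 */
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>

#define STACK_SIZE	(64 * 1024)

static int exec_child(void *arg)
{
	execl("/bin/true", "true", (char *)NULL);
	_exit(127);		/* only reached if exec fails */
}

int main(void)
{
	/* Parent is suspended (CLONE_VFORK) until the child execs or exits. */
	clone(exec_child, (char *)malloc(STACK_SIZE) + STACK_SIZE,
	      CLONE_VM | CLONE_VFORK | SIGCHLD, NULL);
	return 0;
}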
I'm not sure it is worth the effort. The wrong or inaccurate
get_nr_threads() result shouldn't be a problem in this situation. I
would suggest limiting it to "CLONE_THREAD | CLONE_VM", or to
"!CLONE_THREAD && CLONE_VM" if we really want to support this.
> Thanks,
>
> tglx
Sebastian