Message-ID: <20260109165628.Lt2MGP7M@linutronix.de>
Date: Fri, 9 Jan 2026 17:56:28 +0100
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: Thomas Gleixner <tglx@...utronix.de>, Florian Albertz <linux@...m.net>
Cc: mingo@...hat.com, linux-kernel@...r.kernel.org,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: PROBLEM: Kernel 6.17 newly deadlocks futex
On 2025-12-19 21:07:13 [+0100], Thomas Gleixner wrote:
> On Fri, Dec 19 2025 at 11:02, Florian Albertz wrote:
…
> > clone(child, malloc(STACK_SIZE) + STACK_SIZE, CLONE_VM, NULL, NULL, NULL);
> >
> > // And now this futex wait never wakes from kernel 6.17 onwards.
> > syscall(SYS_futex, fut, FUTEX_WAIT_PRIVATE, 0, NULL, NULL, 0);
> > }
>
> The below should fix that. It's not completely correct because the
> resulting hash sizing looks at current->signal->threads. As the signal
> struct is not shared, each resulting process accounts for its own
> threads. Fixing that needs some more thought.
I'm not sure whether I'm mixing things up, or whether this was based on
an earlier version where things were different, but if I remember
correctly PeterZ said that if someone uses CLONE_VM without CLONE_THREAD
then he gets to keep the pieces.
Using only CLONE_VM is okay (well, it is not, but it is not what causes
the problem here). Using CLONE_VM for some clone() invocations and
CLONE_VM + CLONE_THREAD for others is what causes the problem.
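To make that mix concrete, here is a minimal sketch (not the reporter's
actual test case; the waker side is not visible in the quoted fragment,
so the child issuing the wake and STACK_SIZE are assumptions):

/*
 * Illustrative only: one pthread (CLONE_THREAD | CLONE_VM under the
 * hood) mixed with one raw clone(CLONE_VM) child. The child waking the
 * futex is an assumption; the quoted fragment does not show the waker.
 * Build with: gcc -pthread -o mix mix.c
 */
#define _GNU_SOURCE
#include <linux/futex.h>
#include <pthread.h>
#include <sched.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

#define STACK_SIZE	(64 * 1024)

static uint32_t fut;

static void *thread_fn(void *arg)
{
	return NULL;		/* plain pthread: CLONE_THREAD | CLONE_VM */
}

static int child_fn(void *arg)
{
	sleep(1);
	/* Same mm, so a private wake should reach the parent. */
	syscall(SYS_futex, &fut, FUTEX_WAKE_PRIVATE, 1, NULL, NULL, 0);
	return 0;
}

int main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, thread_fn, NULL);
	pthread_join(t, NULL);

	clone(child_fn, (char *)malloc(STACK_SIZE) + STACK_SIZE,
	      CLONE_VM, NULL);

	/* Reported to never return from 6.17 onwards. */
	syscall(SYS_futex, &fut, FUTEX_WAIT_PRIVATE, 0, NULL, NULL, 0);
	return 0;
}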
Who is doing this? Some exotic early container runtime?
CLONE_VM without CLONE_THREAD is common in combination with CLONE_VFORK,
and in that case we don't want to create the private hash.
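For reference, that pattern is roughly what a posix_spawn()-style helper
does before exec (sketch only; names and stack size are made up):

/*
 * Common CLONE_VM | CLONE_VFORK pattern: the child borrows the
 * parent's mm only until it calls exec, so it never needs a private
 * futex hash of its own.
 */
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>

#define STACK_SIZE	(64 * 1024)

static int exec_child(void *arg)
{
	execl("/bin/true", "true", (char *)NULL);
	_exit(127);		/* only reached if exec fails */
}

int main(void)
{
	/* Parent is suspended (CLONE_VFORK) until the child execs or exits. */
	clone(exec_child, (char *)malloc(STACK_SIZE) + STACK_SIZE,
	      CLONE_VM | CLONE_VFORK | SIGCHLD, NULL);
	return 0;
}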
I'm not sure it is worth the effort. The wrong or inaccurate
get_nr_threads() result shouldn't be a problem in this situation. I
would suggest limiting it to "CLONE_THREAD | CLONE_VM", or to
"!CLONE_THREAD && CLONE_VM" if we really want to support this.
> Thanks,
>
> tglx
Sebastian