Message-Id: <1d9fe0eb-11a0-4f8e-a8e7-57e1756193d3@app.fastmail.com>
Date: Fri, 19 Dec 2025 11:02:59 +0100
From: "Florian Albertz" <linux@...m.net>
To: tglx@...utronix.de, mingo@...hat.com
Cc: linux-kernel@...r.kernel.org
Subject: PROBLEM: Kernel 6.17 newly deadlocks futex
Hi everyone,
a program of mine started deadlocking on kernel 6.17: it hangs in a
FUTEX_WAIT_PRIVATE call that is never woken.
First off, due to factors outside of my control, I am using futexes with
FUTEX_PRIVATE_FLAG while also working with child processes that are not
spawned with CLONE_THREAD. They are, however, created with CLONE_VM.
This worked before (and still works, apart from the specific edge case
demonstrated below), but I would understand if it is not fixed, since
FUTEX_PRIVATE_FLAG is documented as being specifically for threaded programs.
I would be very happy if the previous behaviour could be restored, though,
ideally with FUTEX_PRIVATE_FLAG documented to work as long as the processes
share the same memory space.
Now to the actual deadlock. The following program runs to completion on
a released 6.16.10 kernel on x86_64. On kernels 6.17.9 and 6.18.1 it deadlocks.
All tested kernels are from the official Arch Linux repositories:
---
#define _GNU_SOURCE
#include <linux/futex.h>
#include <sched.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)

static uint32_t *fut;

static int noop(void *arg) { return 0; }

static int child(void *arg) {
    // It is important this call to create a thread happens between
    // the wait and wake calls.
    //
    // Due to the new behavior around `need_futex_hash_allocate_defaults`,
    // the first clone which includes CLONE_THREAD (CLONE_VM is not enough)
    // results in a change in how futex hashes are calculated.
    //
    // The stack grows downwards, so clone() gets the top of the allocation.
    clone(noop, (char *)malloc(STACK_SIZE) + STACK_SIZE,
          CLONE_VM | CLONE_SIGHAND | CLONE_THREAD, NULL, NULL, NULL);

    // So this now works with another hash and therefore does not wake the main
    // process.
    *fut = 1;
    syscall(SYS_futex, fut, FUTEX_WAKE_PRIVATE, 1, NULL, NULL, 0);
    return 0;
}

int main(int argc, char *argv[]) {
    fut = calloc(1, sizeof(*fut));

    // Now we create a new process sharing virtual memory but crucially without
    // specifying CLONE_THREAD.
    clone(child, (char *)malloc(STACK_SIZE) + STACK_SIZE, CLONE_VM, NULL, NULL,
          NULL);

    // And now this futex wait never wakes from kernel 6.17 onwards.
    syscall(SYS_futex, fut, FUTEX_WAIT_PRIVATE, 0, NULL, NULL, 0);
}
---
I realise that a fully reliable reproduction would probably need more
synchronization, but I hope the above is enough to demonstrate the problem.
The same goes for error handling. Apologies for anything else in the code
that causes confusion; this reproduction may be the first C code I have
written in years.
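One way to make a missed wake visible instead of hanging forever would be to
give the wait a timeout. The following is only a minimal sketch, not part of
the reproducer above: the helper name futex_wait_private_timeout and its
millisecond parameter are made up for illustration. It relies on FUTEX_WAIT
accepting a relative timeout, in which case an expired wait fails with
ETIMEDOUT.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

// Hypothetical helper (not in the original reproducer): wait on a private
// futex for at most `timeout_ms` milliseconds.
//
// Returns 0 if the waiter was woken. Returns -1 with errno set otherwise:
// ETIMEDOUT if no wake arrived in time, EAGAIN if *addr did not equal
// `expected` when the call was made.
static int futex_wait_private_timeout(uint32_t *addr, uint32_t expected,
                                      long timeout_ms) {
    assert(timeout_ms >= 0);

    // For FUTEX_WAIT the timeout is relative to the time of the call.
    struct timespec ts = {
        .tv_sec = timeout_ms / 1000,
        .tv_nsec = (timeout_ms % 1000) * 1000000L,
    };

    return (int)syscall(SYS_futex, addr, FUTEX_WAIT_PRIVATE, expected, &ts,
                        NULL, 0);
}
```

In main() above, the final wait could then be replaced by a call such as
futex_wait_private_timeout(fut, 0, 1000), treating ETIMEDOUT as "the wake was
lost". A short timeout can also expire simply because the child has not run
yet, so this only reports the symptom and does not replace proper
synchronization.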
The issue does not occur if any process with CLONE_THREAD was created before
the wait. It also does not occur if no process with CLONE_THREAD is created at
all. Finally, the code works as expected if FUTEX_PRIVATE_FLAG is omitted.
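To spell out what "omitted" means here: the variant that works uses the plain
FUTEX_WAIT and FUTEX_WAKE operations instead of the *_PRIVATE ones. A minimal
sketch, with the helper names futex_wait_shared and futex_wake_shared being
made up for illustration:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

// Hypothetical helpers (names are illustrative): the same operations as in
// the reproducer, but without FUTEX_PRIVATE_FLAG.

// Blocks while *addr == expected. Returns 0 when woken, or -1 with errno
// set (EAGAIN if *addr already differs from `expected`).
static long futex_wait_shared(uint32_t *addr, uint32_t expected) {
    assert(addr != NULL);
    return syscall(SYS_futex, addr, FUTEX_WAIT, expected, NULL, NULL, 0);
}

// Wakes at most `max_waiters` waiters on addr. Returns the number of
// waiters woken, or -1 with errno set.
static long futex_wake_shared(uint32_t *addr, int max_waiters) {
    assert(addr != NULL);
    return syscall(SYS_futex, addr, FUTEX_WAKE, max_waiters, NULL, NULL, 0);
}
```

In the reproducer this corresponds to calling futex_wait_shared(fut, 0) in
main() and futex_wake_shared(fut, 1) in child(). futex(2) describes the
private flag as enabling additional performance optimizations, so dropping it
may come at a cost and is a workaround rather than a fix.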
Thank you for your time and work on the kernel. I'll gladly provide any further info you need.
Greetings and happy holidays,
Florian A.