Message-ID: <20250825160406.ZVcVPStz@linutronix.de>
Date: Mon, 25 Aug 2025 18:04:06 +0200
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Borislav Petkov <bp@...en8.de>,
Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>, x86-ml <x86@...nel.org>,
lkml <linux-kernel@...r.kernel.org>
Subject: Re: [GIT PULL] locking/urgent for v6.17-rc1
On 2025-08-22 17:28:02 [-0700], Sean Christopherson wrote:
> > > https://lore.kernel.org/all/aJ_vEP2EHj6l0xRT@google.com
> >
> > I somehow missed it. Can you try rc2 with the patch I just sent?
>
> No dice, fails with the same signature.
>
> I got a trimmed down reproducer. Load KVM, run this in the background (in a loop)
> to constantly trigger try_to_wake_up() on relevant tasks (needs to be run as root):
>
> echo Y > /sys/module/kvm/parameters/nx_huge_pages
> echo N > /sys/module/kvm/parameters/nx_huge_pages
> sleep .2
>
> and then run the hardware_disable_test KVM selftest (from
> tools/testing/selftests/kvm/hardware_disable_test.c).
With this information I was able to reproduce what you had in the link at
the top. I don't know why it happens. The system hangs and lockdep isn't
happy with the lock; it appears to be a valid task_struct::pi_lock
belonging to one of the kvm-nx-lpage-recovery threads.
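To make it obvious why lockdep ends up at pi_lock at all, here is a much
simplified sketch of the entry of try_to_wake_up() (see
kernel/sched/core.c for the real thing, which does a lot more):

  #include <linux/sched.h>

  /* Much simplified from try_to_wake_up(): the target's pi_lock is
   * taken before anything else is done with the task. If @p has
   * already been freed, we lock whatever now lives at that address,
   * which is why lockdep can still see a "valid" pi_lock. */
  static int ttwu_sketch(struct task_struct *p, unsigned int state)
  {
          unsigned long flags;
          int success = 0;

          raw_spin_lock_irqsave(&p->pi_lock, flags);
          if (READ_ONCE(p->__state) & state)
                  success = 1;    /* the real code queues the wakeup here */
          raw_spin_unlock_irqrestore(&p->pi_lock, flags);

          return success;
  }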
I got rid of all free_percpu() and kvfree() calls in futex/core.c (leaking
the memory, yes) and it still happens.
Skipping the assignment of the second private hash seemed to avoid the
crash, but it turned out I just hadn't waited long enough.
The strange part is that the private hash is not even used. It gets
allocated and resized because hardware_disable_test creates a lot of
threads, but then it just sits around and waits to be cleared.
And it also seems to happen if I tell futex_hash_allocate_default() not
to do anything at all.
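As an aside, the private hash can be observed from userspace via the
PR_FUTEX_HASH prctl from that series. A minimal sketch; I'm quoting the
uapi constants from memory, so double-check include/uapi/linux/prctl.h:

  #include <stdio.h>
  #include <sys/prctl.h>

  #ifndef PR_FUTEX_HASH
  #define PR_FUTEX_HASH           78
  #define PR_FUTEX_HASH_SET_SLOTS 1
  #define PR_FUTEX_HASH_GET_SLOTS 2
  #endif

  int main(void)
  {
          /* Number of slots in this process' private futex hash,
           * 0 if it is still on the global hash, -1 on old kernels. */
          int slots = prctl(PR_FUTEX_HASH, PR_FUTEX_HASH_GET_SLOTS, 0, 0, 0);

          printf("private futex hash slots: %d\n", slots);
          return 0;
  }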
kvm-nx-lpage-recovery shares the mm, but it grabs its own reference to it.
It might be a coincidence, but according to my traces the task on which
the wakeup chokes seems to be gone already. And with
diff --git a/kernel/vhost_task.c b/kernel/vhost_task.c
--- a/kernel/vhost_task.c
+++ b/kernel/vhost_task.c
@@ -75,7 +84,10 @@ static int vhost_task_fn(void *data)
*/
void vhost_task_wake(struct vhost_task *vtsk)
{
- wake_up_process(vtsk->task);
+ mutex_lock(&vtsk->exit_mutex);
+ if (!test_bit(VHOST_TASK_FLAGS_KILLED, &vtsk->flags))
+ wake_up_process(vtsk->task);
+ mutex_unlock(&vtsk->exit_mutex);
}
EXPORT_SYMBOL_GPL(vhost_task_wake);
it doesn't crash anymore. Could it be that we attempt to wake a task that
is already gone?
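If that is what happens, the interleaving would be roughly this (my
suspicion, not verified):

  /*
   *   kvm-nx-lpage-recovery                module param toggle
   *
   *   vhost_task_fn()
   *     gets SIGKILL
   *     sets VHOST_TASK_FLAGS_KILLED
   *       (under exit_mutex)
   *     do_exit()
   *     ... task_struct freed ...
   *                                        vhost_task_wake()
   *                                          wake_up_process(vtsk->task)
   *                                            -> takes the stale p->pi_lock
   *
   * Taking exit_mutex and checking VHOST_TASK_FLAGS_KILLED in
   * vhost_task_wake(), as in the diff above, closes that window.
   */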
> Strace on hardware_disable_test spewed a whole pile of these
>
> wait4(32861, 0x7ffc66475dec, WNOHANG, NULL) = 0
> futex(0x7fb735c43000, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1, tv_nsec=0}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
That is a shared FUTEX and is probably part of pthread_join().
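For reference, this is the shape of that wait as a standalone program,
just to illustrate the syscall, not the selftest's actual code: a shared
futex word (no FUTEX_PRIVATE_FLAG) plus FUTEX_WAIT_BITSET with an
absolute CLOCK_REALTIME timeout gives a strace line like the one above:

  #include <errno.h>
  #include <linux/futex.h>
  #include <stdint.h>
  #include <stdio.h>
  #include <string.h>
  #include <sys/syscall.h>
  #include <time.h>
  #include <unistd.h>

  int main(void)
  {
          uint32_t futex_word = 0; /* no FUTEX_PRIVATE_FLAG -> shared futex */
          struct timespec to;

          /* FUTEX_WAIT_BITSET takes an absolute timeout. */
          clock_gettime(CLOCK_REALTIME, &to);
          to.tv_sec += 1;

          /* Blocks because the word still holds the expected value (0),
           * then returns -1/ETIMEDOUT after roughly one second. */
          long ret = syscall(SYS_futex, &futex_word,
                             FUTEX_WAIT_BITSET | FUTEX_CLOCK_REALTIME,
                             0, &to, NULL, FUTEX_BITSET_MATCH_ANY);

          if (ret == -1)
                  printf("futex: %s\n", strerror(errno));
          return 0;
  }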
> immediately before the crash. I assume it corresponds to this:
>
> /* Child is still running, keep waiting. */
> if (pid != waitpid(pid, &status, WNOHANG))
> continue;
>
> I also got a new splat on the "WARN_ON_ONCE(ret < 0);" at the end of __futex_ref_atomic_end().
> This happened during boot; AFAICT our userspace was setting up cgroups. In this
> case, the system hung and I had to reboot.
This is odd:
> ------------[ cut here ]------------
> WARNING: CPU: 45 PID: 0 at kernel/futex/core.c:1604 futex_ref_rcu+0xbf/0xf0
…
> Heh, and two more when booting a different system. Guess it's my lucky day.
> This time whatever went sideways didn't appear to be fatal as the system booted
> and I could ssh in. One is the same WARN as above, and the second WARN on the
> system hit the
>
> WARN_ON_ONCE(atomic_long_read(&mm->futex_atomic) != 0);
>
> in futex_hash_allocate().
This means the counters don't add up after the switch. I'm not sure how.
It seems to hit a random task, but it might be fallout from the previous
splat.
Sebastian