linux-kernel - Re: [tip: locking/urgent] futex: Allow to resize the private local hash

Message-ID: <aFR8EuMg82aMCvjo@mozart.vkv.me>
Date: Thu, 19 Jun 2025 14:07:30 -0700
From: Calvin Owens <calvin@...nvd.org>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: linux-kernel@...r.kernel.org, "Lai, Yi" <yi1.lai@...ux.intel.com>,
	"Peter Zijlstra (Intel)" <peterz@...radead.org>, x86@...nel.org
Subject: Re: [tip: locking/urgent] futex: Allow to resize the private local
 hash

On Wednesday 06/18 at 15:47 -0700, Calvin Owens wrote:
> On Wednesday 06/18 at 13:56 -0700, Calvin Owens wrote:
> > ( Dropping linux-tip-commits from Cc )
> >
> > On Wednesday 06/18 at 19:09 +0200, Sebastian Andrzej Siewior wrote:
> > > On 2025-06-18 09:49:18 [-0700], Calvin Owens wrote:
> > > > Didn't get much out of lockdep unfortunately.
> > > >
> > > > It notices the corruption in the spinlock:
> > > >
> > > >     BUG: spinlock bad magic on CPU#2, cargo/4129172
> > > >      lock: 0xffff8881410ecdc8, .magic: dead4ead, .owner: <none>/-1, .owner_cpu: -1
> > >
> > > Yes. Which is what I assumed while I suggested this. But it complains
> > > about bad magic. It says the magic is 0xdead4ead but this is
> > > SPINLOCK_MAGIC. I was expecting any value but this one.
> > >
> > > > That was followed by this WARN:
> > > >
> > > >     ------------[ cut here ]------------
> > > >     rcuref - imbalanced put()
> > > >     WARNING: CPU: 2 PID: 4129172 at lib/rcuref.c:266 rcuref_put_slowpath+0x55/0x70
> > >
> > > This is "reasonable". If the lock is broken, the remaining memory is
> > > probably garbage anyway. It complains there that the reference put due
> > > to invalid counter.
> > >
> > > …
> > > > The oops after that is from a different task this time, but it just
> > > > looks like slab corruption:
> > > >
> > > …
> > >
> > > The previous complained an invalid free from within the exec.
> > >
> > > > No lock/rcu splats at all.
> > > It exploded before that could happen.
> > >
> > > > > If it still explodes without LTO, would you mind trying gcc?
> > > >
> > > > Will do.
> > >
> > > Thank you.
> > >
> > > > Haven't had much luck isolating what triggers it, but if I run two copies
> > > > of these large build jobs in a loop, it reliably triggers in 6-8 hours.
> > > >
> > > > Just to be clear, I can only trigger this on the one machine. I ran it
> > > > through memtest86+ yesterday and it passed, FWIW, but I'm a little
> > > > suspicious of the hardware right now too. I double checked that
> > > > everything in the BIOS related to power/perf is at factory settings.
> > >
> > > But then it is kind of odd that it happens only with the futex code.
> >
> > I think the missing ingredient was PREEMPT: the 2nd machine has been
> > trying for over a day, but I rebuilt its kernel with PREEMPT_FULL this
> > morning (still llvm), and it just hit a similar oops.
> >
> >     Oops: general protection fault, probably for non-canonical address 0x74656d2f74696750: 0000 [#1] SMP
> >     CPU: 10 UID: 1000 PID: 542469 Comm: cargo Not tainted 6.16.0-rc2-00045-g4663747812d1 #1 PREEMPT
> >     Hardware name: Gigabyte Technology Co., Ltd. A620I AX/A620I AX, BIOS F3 07/10/2023
> >     RIP: 0010:futex_hash+0x23/0x90
> >     Code: 1f 84 00 00 00 00 00 41 57 41 56 53 48 89 fb e8 b3 04 fe ff 48 89 df 31 f6 e8 79 00 00 00 48 8b 78 18 49 89 c6 48 85 ff 74 55 <80> 7f 21 00 75 4f f0 83 07 01 79 49 e8 fc 17 37 00 84 c0 75 40 e8
> >     RSP: 0018:ffffc9002e46fcd8 EFLAGS: 00010202
> >     RAX: ffff888a68e25c40 RBX: ffffc9002e46fda0 RCX: 0000000036616534
> >     RDX: 00000000ffffffff RSI: 0000000910180c00 RDI: 74656d2f7469672f
> >     RBP: 00000000000000b0 R08: 000000000318dd0d R09: 000000002e117cb0
> >     R10: 00000000318dd0d0 R11: 000000000000001b R12: 0000000000000000
> >     R13: 000055e79b431170 R14: ffff888a68e25c40 R15: ffff8881ea0ae900
> >     FS:  00007f1b6037b580(0000) GS:ffff8898a528b000(0000) knlGS:0000000000000000
> >     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >     CR2: 0000555830170098 CR3: 0000000d73e93000 CR4: 0000000000350ef0
> >     Call Trace:
> >      <TASK>
> >      futex_wait_setup+0x7e/0x1d0
> >      __futex_wait+0x63/0x120
> >      ? __futex_wake_mark+0x40/0x40
> >      futex_wait+0x5b/0xd0
> >      ? hrtimer_dummy_timeout+0x10/0x10
> >      do_futex+0x86/0x120
> >      __x64_sys_futex+0x10a/0x180
> >      do_syscall_64+0x48/0x4f0
> >      entry_SYSCALL_64_after_hwframe+0x4b/0x53
> >
> > I also enabled DEBUG_PREEMPT, but that didn't print any additional info.
> >
> > I'm testing a GCC kernel on both machines now.
> 
> Machine #2 oopsed with the GCC kernel after just over an hour:
> 
>     BUG: unable to handle page fault for address: ffff88a91eac4458
>     #PF: supervisor read access in kernel mode
>     #PF: error_code(0x0000) - not-present page
>     PGD 4401067 P4D 4401067 PUD 0
>     Oops: Oops: 0000 [#1] SMP
>     CPU: 4 UID: 1000 PID: 881756 Comm: cargo Not tainted 6.16.0-rc2-gcc-00045-g4663747812d1 #1 PREEMPT
>     Hardware name: Gigabyte Technology Co., Ltd. A620I AX/A620I AX, BIOS F3 07/10/2023
>     RIP: 0010:futex_hash+0x16/0x90
>     Code: 4d 85 e4 74 99 4c 89 e7 e8 07 51 80 00 eb 8f 0f 1f 44 00 00 41 54 55 48 89 fd 53 e8 14 f2 fd ff 48 89 ef 31 f6 e8 da f6 ff ff <48> 8b 78 18 48 89 c3 48 85 ff 74 0c 80 7f 21 00 75 06 f0 83 07 01
>     RSP: 0018:ffffc9002973fcf8 EFLAGS: 00010282
>     RAX: ffff88a91eac4440 RBX: ffff888d5a170000 RCX: 00000000add26115
>     RDX: 0000001c49080440 RSI: 00000000236034e8 RDI: 00000000f1a67530
>     RBP: ffffc9002973fdb8 R08: 00000000eb13f1af R09: ffffffff829c0fc0
>     R10: 0000000000000246 R11: 0000000000000000 R12: ffff888d5a1700f0
>     R13: ffffc9002973fdb8 R14: ffffc9002973fd70 R15: 0000000000000002
>     FS:  00007f64614ba9c0(0000) GS:ffff888cccceb000(0000) knlGS:0000000000000000
>     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>     CR2: ffff88a91eac4458 CR3: 000000015e508000 CR4: 0000000000350ef0
>     Call Trace:
>      <TASK>
>      futex_wait_setup+0x51/0x1b0
>      __futex_wait+0xc0/0x120
>      ? __futex_wake_mark+0x50/0x50
>      futex_wait+0x55/0xe0
>      ? hrtimer_setup_sleeper_on_stack+0x30/0x30
>      do_futex+0x91/0x120
>      __x64_sys_futex+0xfc/0x1d0
>      do_syscall_64+0x44/0x1130
>      entry_SYSCALL_64_after_hwframe+0x4b/0x53
>     RIP: 0033:0x7f64615bd74d
>     Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ab c6 0b 00 f7 d8 64 89 01 48
>     RSP: 002b:00007ffea50a6cc8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
>     RAX: ffffffffffffffda RBX: 00007f64615bd730 RCX: 00007f64615bd74d
>     RDX: 0000000000000080 RSI: 0000000000000089 RDI: 000055bb7e399d90
>     RBP: 00007ffea50a6d20 R08: 0000000000000000 R09: 00007ffeffffffff
>     R10: 00007ffea50a6ce0 R11: 0000000000000246 R12: 000000001dcd6401
>     R13: 00007f64614e3710 R14: 000055bb7e399d90 R15: 0000000000000080
>      </TASK>
>     CR2: ffff88a91eac4458
>     ---[ end trace 0000000000000000 ]---
> 
> Two CPUs oopsed at once with that same stack, the config and vmlinux are
> uploaded in the git (https://github.com/jcalvinowens/lkml-debug-616).
> 
> I tried reproducing with DEBUG_PAGEALLOC, but the bug doesn't happen
> with it turned on.

I've been rotating through debug options one at a time. I've reproduced
the oops with each of the following enabled, and none of them yielded
any additional console output:

    * DEBUG_VM
    * PAGE_POISONING (and page_poison=1)
    * DEBUG_ATOMIC_SLEEP
    * DEBUG_PREEMPT

(No poison patterns showed up at all in the oops traces either.)
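
For anyone who wants to repeat that check on the register dumps, here is a
minimal Python sketch. The poison byte values are the ones defined in
include/linux/poison.h. The helper names classify() and is_canonical() are
made up for illustration, and the canonical-address check assumes 4-level
paging (48-bit virtual addresses), which is an assumption about these
machines rather than something verified.

```python
# Byte values from include/linux/poison.h.
POISON_BYTES = {
    0x5A: "POISON_INUSE",
    0x6B: "POISON_FREE",
    0xA5: "POISON_END",
    0xAA: "PAGE_POISON",
}


def classify(value: int) -> str:
    """Classify a 64-bit register value (hypothetical helper)."""
    # x86-64 is little-endian, so this is the byte order seen in memory.
    raw = value.to_bytes(8, "little")
    # Only flags a value whose eight bytes are all the same poison byte.
    if len(set(raw)) == 1 and raw[0] in POISON_BYTES:
        return "poison " + POISON_BYTES[raw[0]]
    # All printable ASCII: probably string data read as a pointer.
    if all(0x20 <= b <= 0x7E for b in raw):
        return "ascii " + repr(raw.decode("ascii"))
    return "other"


def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """True if bits 63..(va_bits-1) are all equal (assumes 48-bit VA)."""
    top = addr >> (va_bits - 1)
    return top == 0 or top == (1 << (64 - va_bits + 1)) - 1


if __name__ == "__main__":
    # RDI and RAX from the clang PREEMPT_FULL oops quoted above.
    for name, reg in (("RDI", 0x74656D2F7469672F), ("RAX", 0xFFFF888A68E25C40)):
        print(name, hex(reg), classify(reg), "canonical:", is_canonical(reg))
```

Fed the RDI value from the clang PREEMPT_FULL oops (0x74656d2f7469672f),
it reports printable ASCII, '/git/met', not a poison pattern. The faulting
address in that oops is RDI + 0x21.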

I am not able to reproduce the oops at all with these options:

    * DEBUG_PAGEALLOC_ENABLE_DEFAULT
    * SLUB_DEBUG_ON

I'm also experimenting with stress-ng as a reproducer, but no luck so far.

A third machine with an older Skylake CPU died overnight, but nothing
was logged over netconsole. Luckily it has a serial header on the
motherboard, so that's wired up now and it's running again. Maybe it
dies in a different way that gives a better clue...

> > Thanks,
> > Calvin
> >
> > > Sebastian