[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aFR8EuMg82aMCvjo@mozart.vkv.me>
Date: Thu, 19 Jun 2025 14:07:30 -0700
From: Calvin Owens <calvin@...nvd.org>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: linux-kernel@...r.kernel.org, "Lai, Yi" <yi1.lai@...ux.intel.com>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>, x86@...nel.org
Subject: Re: [tip: locking/urgent] futex: Allow to resize the private local
hash
On Wednesday 06/18 at 15:47 -0700, Calvin Owens wrote:
> On Wednesday 06/18 at 13:56 -0700, Calvin Owens wrote:
> > ( Dropping linux-tip-commits from Cc )
> >
> > On Wednesday 06/18 at 19:09 +0200, Sebastian Andrzej Siewior wrote:
> > > On 2025-06-18 09:49:18 [-0700], Calvin Owens wrote:
> > > > Didn't get much out of lockdep unfortunately.
> > > >
> > > > It notices the corruption in the spinlock:
> > > >
> > > > BUG: spinlock bad magic on CPU#2, cargo/4129172
> > > > lock: 0xffff8881410ecdc8, .magic: dead4ead, .owner: <none>/-1, .owner_cpu: -1
> > >
> > > Yes. Which is what I assumed while I suggested this. But it complains
> > > about bad magic. It says the magic is 0xdead4ead but this is
> > > SPINLOCK_MAGIC. I was expecting any value but this one.
> > >
> > > > That was followed by this WARN:
> > > >
> > > > ------------[ cut here ]------------
> > > > rcuref - imbalanced put()
> > > > WARNING: CPU: 2 PID: 4129172 at lib/rcuref.c:266 rcuref_put_slowpath+0x55/0x70
> > >
> > > This is "reasonable". If the lock is broken, the remaining memory is
> > > probably garbage anyway. It complains there that the reference put due
> > > to invalid counter.
> > >
> > > …
> > > > The oops after that is from a different task this time, but it just
> > > > looks like slab corruption:
> > > >
> > > …
> > >
> > > The previous complained an invalid free from within the exec.
> > >
> > > > No lock/rcu splats at all.
> > > It exploded before that could happen.
> > >
> > > > > If it still explodes without LTO, would you mind trying gcc?
> > > >
> > > > Will do.
> > >
> > > Thank you.
> > >
> > > > Haven't had much luck isolating what triggers it, but if I run two copies
> > > > of these large build jobs in a loop, it reliably triggers in 6-8 hours.
> > > >
> > > > Just to be clear, I can only trigger this on the one machine. I ran it
> > > > through memtest86+ yesterday and it passed, FWIW, but I'm a little
> > > > suspicious of the hardware right now too. I double checked that
> > > > everything in the BIOS related to power/perf is at factory settings.
> > >
> > > But then it is kind of odd that it happens only with the futex code.
> >
> > I think the missing ingredient was PREEMPT: the 2nd machine has been
> > trying for over a day, but I rebuilt its kernel with PREEMPT_FULL this
> > morning (still llvm), and it just hit a similar oops.
> >
> > Oops: general protection fault, probably for non-canonical address 0x74656d2f74696750: 0000 [#1] SMP
> > CPU: 10 UID: 1000 PID: 542469 Comm: cargo Not tainted 6.16.0-rc2-00045-g4663747812d1 #1 PREEMPT
> > Hardware name: Gigabyte Technology Co., Ltd. A620I AX/A620I AX, BIOS F3 07/10/2023
> > RIP: 0010:futex_hash+0x23/0x90
> > Code: 1f 84 00 00 00 00 00 41 57 41 56 53 48 89 fb e8 b3 04 fe ff 48 89 df 31 f6 e8 79 00 00 00 48 8b 78 18 49 89 c6 48 85 ff 74 55 <80> 7f 21 00 75 4f f0 83 07 01 79 49 e8 fc 17 37 00 84 c0 75 40 e8
> > RSP: 0018:ffffc9002e46fcd8 EFLAGS: 00010202
> > RAX: ffff888a68e25c40 RBX: ffffc9002e46fda0 RCX: 0000000036616534
> > RDX: 00000000ffffffff RSI: 0000000910180c00 RDI: 74656d2f7469672f
> > RBP: 00000000000000b0 R08: 000000000318dd0d R09: 000000002e117cb0
> > R10: 00000000318dd0d0 R11: 000000000000001b R12: 0000000000000000
> > R13: 000055e79b431170 R14: ffff888a68e25c40 R15: ffff8881ea0ae900
> > FS: 00007f1b6037b580(0000) GS:ffff8898a528b000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 0000555830170098 CR3: 0000000d73e93000 CR4: 0000000000350ef0
> > Call Trace:
> > <TASK>
> > futex_wait_setup+0x7e/0x1d0
> > __futex_wait+0x63/0x120
> > ? __futex_wake_mark+0x40/0x40
> > futex_wait+0x5b/0xd0
> > ? hrtimer_dummy_timeout+0x10/0x10
> > do_futex+0x86/0x120
> > __x64_sys_futex+0x10a/0x180
> > do_syscall_64+0x48/0x4f0
> > entry_SYSCALL_64_after_hwframe+0x4b/0x53
> >
> > I also enabled DEBUG_PREEMPT, but that didn't print any additional info.
> >
> > I'm testing a GCC kernel on both machines now.
>
> Machine #2 oopsed with the GCC kernel after just over an hour:
>
> BUG: unable to handle page fault for address: ffff88a91eac4458
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 4401067 P4D 4401067 PUD 0
> Oops: Oops: 0000 [#1] SMP
> CPU: 4 UID: 1000 PID: 881756 Comm: cargo Not tainted 6.16.0-rc2-gcc-00045-g4663747812d1 #1 PREEMPT
> Hardware name: Gigabyte Technology Co., Ltd. A620I AX/A620I AX, BIOS F3 07/10/2023
> RIP: 0010:futex_hash+0x16/0x90
> Code: 4d 85 e4 74 99 4c 89 e7 e8 07 51 80 00 eb 8f 0f 1f 44 00 00 41 54 55 48 89 fd 53 e8 14 f2 fd ff 48 89 ef 31 f6 e8 da f6 ff ff <48> 8b 78 18 48 89 c3 48 85 ff 74 0c 80 7f 21 00 75 06 f0 83 07 01
> RSP: 0018:ffffc9002973fcf8 EFLAGS: 00010282
> RAX: ffff88a91eac4440 RBX: ffff888d5a170000 RCX: 00000000add26115
> RDX: 0000001c49080440 RSI: 00000000236034e8 RDI: 00000000f1a67530
> RBP: ffffc9002973fdb8 R08: 00000000eb13f1af R09: ffffffff829c0fc0
> R10: 0000000000000246 R11: 0000000000000000 R12: ffff888d5a1700f0
> R13: ffffc9002973fdb8 R14: ffffc9002973fd70 R15: 0000000000000002
> FS: 00007f64614ba9c0(0000) GS:ffff888cccceb000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffff88a91eac4458 CR3: 000000015e508000 CR4: 0000000000350ef0
> Call Trace:
> <TASK>
> futex_wait_setup+0x51/0x1b0
> __futex_wait+0xc0/0x120
> ? __futex_wake_mark+0x50/0x50
> futex_wait+0x55/0xe0
> ? hrtimer_setup_sleeper_on_stack+0x30/0x30
> do_futex+0x91/0x120
> __x64_sys_futex+0xfc/0x1d0
> do_syscall_64+0x44/0x1130
> entry_SYSCALL_64_after_hwframe+0x4b/0x53
> RIP: 0033:0x7f64615bd74d
> Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ab c6 0b 00 f7 d8 64 89 01 48
> RSP: 002b:00007ffea50a6cc8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
> RAX: ffffffffffffffda RBX: 00007f64615bd730 RCX: 00007f64615bd74d
> RDX: 0000000000000080 RSI: 0000000000000089 RDI: 000055bb7e399d90
> RBP: 00007ffea50a6d20 R08: 0000000000000000 R09: 00007ffeffffffff
> R10: 00007ffea50a6ce0 R11: 0000000000000246 R12: 000000001dcd6401
> R13: 00007f64614e3710 R14: 000055bb7e399d90 R15: 0000000000000080
> </TASK>
> CR2: ffff88a91eac4458
> ---[ end trace 0000000000000000 ]---
>
> Two CPUs oopsed at once with that same stack, the config and vmlinux are
> uploaded in the git (https://github.com/jcalvinowens/lkml-debug-616).
>
> I tried reproducing with DEBUG_PAGEALLOC, but the bug doesn't happen
> with it turned on.
I've been rotating through debug options one at a time, I've reproduced
the oops with the following which yielded no additional console output:
* DEBUG_VM
* PAGE_POISONING (and page_poison=1)
* DEBUG_ATOMIC_SLEEP
* DEBUG_PREEMPT
(No poison patterns showed up at all in the oops traces either.)
I am not able to reproduce the oops at all with these options:
* DEBUG_PAGEALLOC_ENABLE_DEFAULT
* SLUB_DEBUG_ON
I'm also experimenting with stress-ng as a reproducer, no luck so far.
A third machine with an older Skylake CPU died overnight, but nothing
was logged over netconsole. Luckily it actually has a serial header on
the motherboard, so that's wired up and it's running again, maybe it
dies in a different way that might be a better clue...
> > Thanks,
> > Calvin
> >
> > > Sebastian
Powered by blists - more mailing lists