[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aFYEpPIwhlL1WvR0@mozart.vkv.me>
Date: Fri, 20 Jun 2025 18:02:28 -0700
From: Calvin Owens <calvin@...nvd.org>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: linux-kernel@...r.kernel.org, "Lai, Yi" <yi1.lai@...ux.intel.com>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>, x86@...nel.org
Subject: Re: [tip: locking/urgent] futex: Allow to resize the private local
hash
On Friday 06/20 at 11:56 -0700, Calvin Owens wrote:
> On Friday 06/20 at 12:31 +0200, Sebastian Andrzej Siewior wrote:
> > On 2025-06-19 14:07:30 [-0700], Calvin Owens wrote:
> > > > Machine #2 oopsed with the GCC kernel after just over an hour:
> > > >
> > > > BUG: unable to handle page fault for address: ffff88a91eac4458
> > > > RIP: 0010:futex_hash+0x16/0x90
> > …
> > > > Call Trace:
> > > > <TASK>
> > > > futex_wait_setup+0x51/0x1b0
> > …
> >
> > The futex_hash_bucket pointer has an invalid ->priv pointer.
> > This could be use-after-free or double-free. I've been looking through
> > your config and you don't have CONFIG_SLAB_FREELIST_* set. I don't
> > remember which one but one of the two has a "primitiv" double free
> > detection.
> >
> > …
> > > I am not able to reproduce the oops at all with these options:
> > >
> > > * DEBUG_PAGEALLOC_ENABLE_DEFAULT
> > > * SLUB_DEBUG_ON
> >
> > SLUB_DEBUG_ON is something that would "reliably" notice double free.
> > If you drop SLUB_DEBUG_ON (but keep SLUB_DEBUG) then you can boot with
> > slab_debug=f keeping only the consistency checks. The "poison" checks
> > would be excluded for instance. That allocation is kvzalloc() but it
> > should be small on your machine to avoid vmalloc() and use only
> > kmalloc().
>
> I'll try slab_debug=f next.
I just hit the oops with SLUB_DEBUG and slab_debug=f, but nothing new
was logged.
> > > I'm also experimenting with stress-ng as a reproducer, no luck so far.
> >
> > Not sure what you are using there. I think cargo does:
> > - lock/ unlock in a threads
> > - create new thread which triggers auto-resize
> > - auto-resize gets delayed due to lock/ unlock in other threads (the
> > reference is held)
>
> I've tried various combinations of --io, --fork, --exec, --futex, --cpu,
> --vm, and --forkheavy. It's not mixing the operations in threads as I
> understand it, so I guess it won't ever do anything like what you're
> describing no matter what stressors I run?
>
> I did get this message once, something I haven't seen before:
>
> [33024.247423] [ T281] sched: DL replenish lagged too much
>
> ...but maybe that's my fault for overloading it so much.
>
> > And now something happens leading to what we see.
> > _Maybe_ the cargo application terminates/ execs before the new struct is
> > assigned in an unexpected way.
> > The regular hash bucket has reference counting so it should raise
> > warnings if it goes wrong. I haven't seen those.
> >
> > > A third machine with an older Skylake CPU died overnight, but nothing
> > > was logged over netconsole. Luckily it actually has a serial header on
> > > the motherboard, so that's wired up and it's running again, maybe it
> > > dies in a different way that might be a better clue...
> >
> > So far I *think* that cargo does something that I don't expect and this
> > leads to a memory double-free. The SLUB_DEBUG_ON hopefully delays the
> > process long enough that the double free does not trigger.
> >
> > I think I'm going to look for a random rust packet that is using cargo
> > for building (unless you have a recommendation) and look what it is
> > doing. It was always cargo after all. Maybe this brings some light.
>
> The list of things in my big build that use cargo is pretty short:
>
> === Dependendency Snapshot ===
> Dep =mc:house:cargo-native.do_install
> Package=mc:house:cargo-native.do_populate_sysroot
> RDep =mc:house:cargo-c-native.do_prepare_recipe_sysroot
> mc:house:cargo-native.do_create_spdx
> mc:house:cbindgen-native.do_prepare_recipe_sysroot
> mc:house:librsvg-native.do_prepare_recipe_sysroot
> mc:house:librsvg.do_prepare_recipe_sysroot
> mc:house:libstd-rs.do_prepare_recipe_sysroot
> mc:house:python3-maturin-native.do_prepare_recipe_sysroot
> mc:house:python3-maturin-native.do_populate_sysroot
> mc:house:python3-rpds-py.do_prepare_recipe_sysroot
> mc:house:python3-setuptools-rust-native.do_prepare_recipe_sysroot
>
> I've tried building each of those targets alone (and all of them
> together) in a loop, but that hasn't triggered anything. I guess that
> other concurrent builds are necessary to trigger whatever this is.
>
> I tried using stress-ng --vm and --cpu together to "load up" the machine
> while running the isolated targets, but that hasn't worked either.
>
> If you want to run *exactly* what I am, clone this unholy mess:
>
> https://github.com/jcalvinowens/meta-house
>
> ...setup for yocto and install kas as described here:
>
> https://docs.yoctoproject.org/ref-manual/system-requirements.html#ubuntu-and-debian
> https://github.com/jcalvinowens/meta-house/blob/6f6a9c643169fc37ba809f7230261d0e5255b6d7/README.md#kas
>
> ...and run (for the 32-thread machine):
>
> BB_NUMBER_THREADS="48" PARALLEL_MAKE="-j 36" kas build kas/walnascar.yaml -- -k
>
> Fair warning, it needs a *lot* of RAM at the high concurrency, I have
> 96GB with 128GB of swap to spill into. It needs ~500GB of disk space if
> it runs to completion and downloads ~15GB of tarballs when it starts.
>
> Annoyingly it won't work if the system compiler is gcc-15 right now (the
> verison of glib it has won't build, haven't had a chance to fix it yet).
>
> > > > > Thanks,
> > > > > Calvin
> >
> > Sebastian
Powered by blists - more mailing lists