linux-kernel - Re: [tip: locking/urgent] futex: Allow to resize the private local hash

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aFYEpPIwhlL1WvR0@mozart.vkv.me>
Date: Fri, 20 Jun 2025 18:02:28 -0700
From: Calvin Owens <calvin@...nvd.org>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: linux-kernel@...r.kernel.org, "Lai, Yi" <yi1.lai@...ux.intel.com>,
	"Peter Zijlstra (Intel)" <peterz@...radead.org>, x86@...nel.org
Subject: Re: [tip: locking/urgent] futex: Allow to resize the private local
 hash

On Friday 06/20 at 11:56 -0700, Calvin Owens wrote:
> On Friday 06/20 at 12:31 +0200, Sebastian Andrzej Siewior wrote:
> > On 2025-06-19 14:07:30 [-0700], Calvin Owens wrote:
> > > > Machine #2 oopsed with the GCC kernel after just over an hour:
> > > > 
> > > >     BUG: unable to handle page fault for address: ffff88a91eac4458
> > > >     RIP: 0010:futex_hash+0x16/0x90
> > …
> > > >     Call Trace:
> > > >      <TASK>
> > > >      futex_wait_setup+0x51/0x1b0
> > …
> > 
> > The futex_hash_bucket pointer has an invalid ->priv pointer.
> > This could be use-after-free or double-free. I've been looking through
> > your config and you don't have CONFIG_SLAB_FREELIST_* set. I don't
> > remember which one but one of the two has a "primitiv" double free
> > detection. 
> > 
> > …
> > > I am not able to reproduce the oops at all with these options:
> > > 
> > >     * DEBUG_PAGEALLOC_ENABLE_DEFAULT
> > >     * SLUB_DEBUG_ON
> > 
> > SLUB_DEBUG_ON is something that would "reliably" notice double free.
> > If you drop SLUB_DEBUG_ON (but keep SLUB_DEBUG) then you can boot with
> > slab_debug=f keeping only the consistency checks. The "poison" checks
> > would be excluded for instance. That allocation is kvzalloc() but it
> > should be small on your machine to avoid vmalloc() and use only
> > kmalloc().
> 
> I'll try slab_debug=f next.

I just hit the oops with SLUB_DEBUG and slab_debug=f, but nothing new
was logged.

> > > I'm also experimenting with stress-ng as a reproducer, no luck so far.
> > 
> > Not sure what you are using there. I think cargo does:
> > - lock/ unlock in a threads
> > - create new thread which triggers auto-resize
> > - auto-resize gets delayed due to lock/ unlock in other threads (the
> >   reference is held)
> 
> I've tried various combinations of --io, --fork, --exec, --futex, --cpu,
> --vm, and --forkheavy. It's not mixing the operations in threads as I
> understand it, so I guess it won't ever do anything like what you're
> describing no matter what stressors I run?
> 
> I did get this message once, something I haven't seen before:
> 
>     [33024.247423] [    T281] sched: DL replenish lagged too much
> 
> ...but maybe that's my fault for overloading it so much.
> 
> > And now something happens leading to what we see.
> > _Maybe_ the cargo application terminates/ execs before the new struct is
> > assigned in an unexpected way.
> > The regular hash bucket has reference counting so it should raise
> > warnings if it goes wrong. I haven't seen those.
> > 
> > > A third machine with an older Skylake CPU died overnight, but nothing
> > > was logged over netconsole. Luckily it actually has a serial header on
> > > the motherboard, so that's wired up and it's running again, maybe it
> > > dies in a different way that might be a better clue...
> > 
> > So far I *think* that cargo does something that I don't expect and this
> > leads to a memory double-free. The SLUB_DEBUG_ON hopefully delays the
> > process long enough that the double free does not trigger.
> > 
> > I think I'm going to look for a random rust packet that is using cargo
> > for building (unless you have a recommendation) and look what it is
> > doing. It was always cargo after all. Maybe this brings some light.
> 
> The list of things in my big build that use cargo is pretty short:
> 
>     === Dependendency Snapshot ===
>     Dep    =mc:house:cargo-native.do_install
>     Package=mc:house:cargo-native.do_populate_sysroot
>     RDep   =mc:house:cargo-c-native.do_prepare_recipe_sysroot
>             mc:house:cargo-native.do_create_spdx
>             mc:house:cbindgen-native.do_prepare_recipe_sysroot
>             mc:house:librsvg-native.do_prepare_recipe_sysroot
>             mc:house:librsvg.do_prepare_recipe_sysroot
>             mc:house:libstd-rs.do_prepare_recipe_sysroot
>             mc:house:python3-maturin-native.do_prepare_recipe_sysroot
>             mc:house:python3-maturin-native.do_populate_sysroot
>             mc:house:python3-rpds-py.do_prepare_recipe_sysroot
>             mc:house:python3-setuptools-rust-native.do_prepare_recipe_sysroot
> 
> I've tried building each of those targets alone (and all of them
> together) in a loop, but that hasn't triggered anything. I guess that
> other concurrent builds are necessary to trigger whatever this is.
> 
> I tried using stress-ng --vm and --cpu together to "load up" the machine
> while running the isolated targets, but that hasn't worked either.
> 
> If you want to run *exactly* what I am, clone this unholy mess:
> 
>     https://github.com/jcalvinowens/meta-house
> 
> ...setup for yocto and install kas as described here:
> 
>     https://docs.yoctoproject.org/ref-manual/system-requirements.html#ubuntu-and-debian
>     https://github.com/jcalvinowens/meta-house/blob/6f6a9c643169fc37ba809f7230261d0e5255b6d7/README.md#kas
> 
> ...and run (for the 32-thread machine):
> 
>     BB_NUMBER_THREADS="48" PARALLEL_MAKE="-j 36" kas build kas/walnascar.yaml -- -k
> 
> Fair warning, it needs a *lot* of RAM at the high concurrency, I have
> 96GB with 128GB of swap to spill into. It needs ~500GB of disk space if
> it runs to completion and downloads ~15GB of tarballs when it starts.
> 
> Annoyingly it won't work if the system compiler is gcc-15 right now (the
> verison of glib it has won't build, haven't had a chance to fix it yet).
> 
> > > > > Thanks,
> > > > > Calvin
> > 
> > Sebastian