Message-ID: <e419e15c-7bfc-4fc2-9089-e271a3b0576e@lucifer.local>
Date: Wed, 23 Jul 2025 20:01:06 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Suren Baghdasaryan <surenb@...gle.com>
Cc: Jann Horn <jannh@...gle.com>, Vlastimil Babka <vbabka@...e.cz>,
        Andrew Morton <akpm@...ux-foundation.org>,
        "Liam R. Howlett" <Liam.Howlett@...cle.com>,
        Pedro Falcato <pfalcato@...e.de>, Linux-MM <linux-mm@...ck.org>,
        kernel list <linux-kernel@...r.kernel.org>
Subject: Re: [BUG] hard-to-hit mm_struct UAF due to insufficiently careful
 vma_refcount_put() wrt SLAB_TYPESAFE_BY_RCU

On Wed, Jul 23, 2025 at 10:55:06AM -0700, Suren Baghdasaryan wrote:
> On Wed, Jul 23, 2025 at 10:50 AM Jann Horn <jannh@...gle.com> wrote:
> >
> > On Wed, Jul 23, 2025 at 7:32 PM Vlastimil Babka <vbabka@...e.cz> wrote:
> > > On 7/23/25 18:26, Jann Horn wrote:
> > > > There's a racy UAF in `vma_refcount_put()` when called on the
> > > > `lock_vma_under_rcu()` path because `SLAB_TYPESAFE_BY_RCU` is used
> > > > without sufficient protection against concurrent object reuse:
> > >
> > > Oof.
>
> Thanks for analyzing this Jann. Yeah, I missed the fact that
> vma_refcount_put() uses vma->vm_mm.
>
> > >
> > > > I'm not sure what the right fix is; I guess one approach would be to
> > > > have a special version of vma_refcount_put() for cases where the VMA
> > > > has been recycled by another MM that grabs an extra reference to the
> > > > MM? But then dropping a reference to the MM afterwards might be a bit
> > > > annoying and might require something like mmdrop_async()...
> > >
> > > Would we need mmdrop_async()? Isn't this the case for mmget_not_zero() and
> > > mmput_async()?
> >
> > Now I'm not sure anymore if either of those approaches would work,
> > because they rely on the task that's removing the VMA to wait until we
> > do __refcount_dec_and_test() before deleting the MM... but I don't
> > think we have any such guarantee...
>
> This is tricky. Let me look into it some more before suggesting any fixes.

Thanks Suren! :)

I feel the strong desire to document this seqnum approach as it is
intricate, so will find some time to do that for my own benefit at least.

The fact VMAs can be recycled like this at any time makes me super nervous,
so I wonder if we could find ways to, at least in a debug mode (perhaps
even in a CONFIG_DEBUG_VM_MAPLE_TREE-style 'we are fine with this being
very very slow' sort of way), pick up on potentially super weird
small-race-window style issues like this.

Because it feels like debugging them 'in the wild' might be really horrid.

Or maybe it's even possible to shuffle things around and do some testing in
userland via the VMA userland tests... possibly pipe dream though given the
mechanisms that would need to be put there.

It's sort of hard now to separate VMA locks from VMA operations in general,
so that's something I need to think about anyway.

But I'm almost certainly going to document this in an 'internal' portion of
the process addrs doc page we have, at least to teach myself the deeper
internals...

Cheers, Lorenzo
