lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAG48ez23CPO-m6kPaEs8kLUfRVCN+QMbsEn7BocfaJuq=gRwaA@mail.gmail.com>
Date: Fri, 25 Jul 2025 16:44:31 +0200
From: Jann Horn <jannh@...gle.com>
To: Vlastimil Babka <vbabka@...e.cz>
Cc: Andrew Morton <akpm@...ux-foundation.org>, David Hildenbrand <david@...hat.com>, 
	Lorenzo Stoakes <lorenzo.stoakes@...cle.com>, Rik van Riel <riel@...riel.com>, 
	"Liam R. Howlett" <Liam.Howlett@...cle.com>, Harry Yoo <harry.yoo@...cle.com>, linux-mm@...ck.org, 
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] mm/rmap: Add anon_vma lifetime debug check

On Fri, Jul 25, 2025 at 4:11 PM Vlastimil Babka <vbabka@...e.cz> wrote:
> On 7/25/25 14:16, Jann Horn wrote:
> > If an anon folio is mapped into userspace, its anon_vma must be alive,
> > otherwise rmap walks can hit UAF.
> >
> > There have been syzkaller reports a few months ago[1][2] of UAF in rmap
> > walks that seems to indicate that there can be pages with elevated mapcount
> > whose anon_vma has already been freed, but I think we never figured out
> > what the cause is; and syzkaller only hit these UAFs when memory pressure
> > randomly caused reclaim to rmap-walk the affected pages, so it of course
> > didn't manage to create a reproducer.
> >
> > Add a VM_WARN_ON_FOLIO() when we add/remove mappings of anonymous folios to
> > hopefully catch such issues more reliably.
> >
> > [1] https://lore.kernel.org/r/67abaeaf.050a0220.110943.0041.GAE@google.com
> > [2] https://lore.kernel.org/r/67a76f33.050a0220.3d72c.0028.GAE@google.com
> >
> > Acked-by: David Hildenbrand <david@...hat.com>
> > Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
> > Signed-off-by: Jann Horn <jannh@...gle.com>
> > ---
> > Changes in v2:
> > - applied akpm's fixup (use FOLIO_MAPPING_ANON, ...)
> > - remove CONFIG_DEBUG_VM check and use folio_test_* helpers (David)
> > - more verbose comment (Lorenzo)
> > - replaced "page" mentions with "folio" in commit message
> > - Link to v1: https://lore.kernel.org/r/20250724-anonvma-uaf-debug-v1-1-29989ddc4e2a@google.com
> > ---
> >  include/linux/rmap.h | 22 ++++++++++++++++++++++
> >  1 file changed, 22 insertions(+)
> >
> > diff --git a/include/linux/rmap.h b/include/linux/rmap.h
> > index 20803fcb49a7..6cd020eea37a 100644
> > --- a/include/linux/rmap.h
> > +++ b/include/linux/rmap.h
> > @@ -449,6 +449,28 @@ static inline void __folio_rmap_sanity_checks(const struct folio *folio,
> >       default:
> >               VM_WARN_ON_ONCE(true);
> >       }
> > +
> > +     /*
> > +      * Anon folios must have an associated live anon_vma as long as they're
> > +      * mapped into userspace.
> > +      * Note that the atomic_read() mainly does two things:
> > +      *
> > +      * 1. In KASAN builds with CONFIG_SLUB_RCU_DEBUG, it causes KASAN to
> > +      *    check that the associated anon_vma has not yet been freed (subject
>
> I think more precisely it checks that the slab folio hosting the anon_vma
> could not have been yet freed, IIUC? If the anon_vma itself has been freed
> then this will not trigger.

The point of CONFIG_SLUB_RCU_DEBUG, which I'm talking about here, is
that it allows KASAN to catch UAF once the anon_vma has been freed and
an RCU grace period has passed; it is not necessary that the slab
folio has been freed.

You can see that working in the linked syzkaller reports - KASAN
tracked the object as freed after slab_free_after_rcu_debug(), which
is an RCU callback scheduled from kmem_cache_free().

> > +      *    to KASAN's usual limitations). This check will pass if the
> > +      *    anon_vma's refcount has already dropped to 0 but an RCU grace
> > +      *    period hasn't passed since then.
>
> AFAIU this says it more accurately and matches my interpretation above?
>
> > +      * 2. If the anon_vma has not yet been freed, it checks that the
> > +      *    anon_vma still has a nonzero refcount (as opposed to being in the
> > +      *    middle of an RCU delay for getting freed).
>
> Again the RCU delay would apply to the slab page, unless you talk about the
> CONFIG_SLUB_RCU_DEBUG specific path (IIRC).

Yes, right, the "RCU delay" in the second bullet point refers to
CONFIG_SLUB_RCU_DEBUG.

Here I'm saying "If the anon_vma has not yet been freed" because
that's the only case in which I can reliably say what will happen, and
this is the main case that isn't already covered by the first bullet
point in a CONFIG_SLUB_RCU_DEBUG build.

> That said, I wonder if here in __folio_rmap_sanity_checks() we are even in a
> situation where we rely on SLAB_TYPESAFE_BY_RCU in order to not touch
> something that's not anon_vma anymore? I think we expect it to exist?

Yes, we expect it to exist. That's why I'm not just asserting that the
anon_vma is still considered live by KASAN, but also that its refcount
is non-zero.

> Can we
> thus invent a CONFIG_SLUB_RCU_DEBUG-specific assert that assert the anon_vma
> itself has not been freed yet (i.e. even if within a grace period?).

That is essentially what I'm doing - checking that the count is
nonzero verifies that it's not within a grace period, and the implicit
KASAN check verifies it can't be in a KASAN quarantine after the grace
period is over.

> I was wondering what actually does rely on SLAB_TYPESAFE_BY_RCU, thanks
> Lorenzo for pointing me to folio_get_anon_vma(). But that's only used
> elsewhere than here, no?

Yes; the point of this assertion is that folio_get_anon_vma() and
folio_lock_anon_vma_read() (which show up in the stack traces of the
two linked syzkaller reports) rely on the ->mapping not being
dangling; and they can happen in the background anytime as long as the
folio mapcount is elevated, but they only actually happen sporadically
(in particular under memory pressure, which syzkaller apparently
mostly triggers randomly and non-reproducibly). If we have bugs where
the ->mapping of a folio goes away while it still has an elevated
mapcount, then instead of randomly getting KASAN splats under memory
pressure, it would be nice to detect this reliably. So I figured a
nice place to check this is when we're decrementing the mapcount -
that should catch almost all cases, except ones where such a bug
happens because we somehow leaked both a mapcount and a refcount of a
folio.

> > +      */
> > +     if (folio_test_anon(folio) && !folio_test_ksm(folio)) {
>
> So you verified this compiles out completely without DEBUG_VM?

No, after David's suggestion to remove the explicit CONFIG_DEBUG_VM
check, I looked at the definitions of these things to check that it's
all just plain reads and arithmetic, and removed the CONFIG_DEBUG_VM
check, but haven't actually compile-tested it.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ