[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAG48ez31aVnoBiMMjUczbmThWnRGmod4yppgMVqf2Nu5-hjU2g@mail.gmail.com>
Date: Fri, 25 Jul 2025 14:00:18 +0200
From: Jann Horn <jannh@...gle.com>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, David Hildenbrand <david@...hat.com>,
Rik van Riel <riel@...riel.com>, "Liam R. Howlett" <Liam.Howlett@...cle.com>,
Vlastimil Babka <vbabka@...e.cz>, Harry Yoo <harry.yoo@...cle.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm/rmap: Add anon_vma lifetime debug check
On Fri, Jul 25, 2025 at 1:32 PM Lorenzo Stoakes
<lorenzo.stoakes@...cle.com> wrote:
> On Thu, Jul 24, 2025 at 09:13:50PM +0200, Jann Horn wrote:
> > If an anon page is mapped into userspace, its anon_vma must be alive,
> > otherwise rmap walks can hit UAF.
>
> This is extremely weird. I guess it's a UAF of vma->anon_vma
>
> Because we:
>
> put_anon_vma()
> ->__put_anon_vma()
> ->anon_vma_free() (also the root anon_vma... interestingly!)
FWIW, the thing we're trying to access can't be the root anon_vma,
since KASAN says the UAF'd object was allocated from anon_vma_fork.
> But none of this obviously updates the vma to set vma->anon_vma to NULL.
>
> So yeah god almighty. To get this, we must be dereffing vma->anon_vma
> somewhere unexpected.
I don't see how vma->anon_vma is directly relevant. I think the
relevant things are:
An anon_vma is only kept alive as long as at least one associated VMA
is still alive.
An anon folio may outlive the VMAs it comes from, so it may also
outlive its associated anon_vma.
When an anon_vma goes away, it does not clear the ->mapping of
associated folios (this is an intentional design choice).
When an rmap traversal wants to go from a folio to the associated
anon_vma, it (under RCU) checks that the mapcount is non-zero, which
implies that there must still be associated VMAs, which implies that
the associated anon_vma must still be alive; and then it dereferences
the ->mapping.
I think the ASAN splat indicates that syzkaller must fairly often get
into situations where the ->mapping pointer is dangling despite the
mapcount being non-zero, but syzkaller likely only actually hits the
UAF when the kernel gets memory pressure by chance and does rmap walks
on the reclaim path.
So there are several invariants we're relying on here, and breaking
any one of these could allow an rmap traversal to cause UAF; a
non-exhaustive list:
1. An anon folio is only mapped into VMAs that are associated with the
folio's anon_vma.
2. Removing an anon folio mapping reduces the anon folio's mapcount
before the VMA can go away.
3. VMA splitting/merging/forking properly preserves the anon_vma association.
4. If the anon-exclusive bit is set, the folio is only mapped in a
single place (otherwise swapout+swapin could erroneously set
RMAP_EXCLUSIVE, causing the swapped-in folio to be associated with the
wrong anon_vma).
5. When a VMA is associated with an anon_vma, it is always also
associated with the corresponding root anon_vma (necessary because
non-RMAP_EXCLUSIVE swapin falls back to associating the folio with the
root anon_vma).
6. If two VMAs in the same process have the same ->anon_vma, their
anonvma chains must be the same (otherwise VMA merging can break
stuff).
> > There have been syzkaller reports a few months ago[1][2] of UAF in rmap
>
> Will try to take a look when I get a chance.
>
> > walks that seems to indicate that there can be pages with elevated mapcount
> > whose anon_vma has already been freed, but I think we never figured out
> > what the cause is; and syzkaller only hit these UAFs when memory pressure
> > randomly caused reclaim to rmap-walk the affected pages, so it of course
> > didn't manage to create a reproducer.
>
> Fun.
>
> Please hook me in (I mean you're going to anyway right :P) on this stuff,
> as I'm looking to rework the anon_vma stuff so am naturally interested in
> any and all rmap anon stuff.
>
> For my sins ;)
>
> Maybe I"ll dig into these syzkallers.
>
> >
> > Add a VM_WARN_ON_FOLIO() when we add/remove mappings of anonymous pages to
> > hopefully catch such issues more reliably.
>
> Good idea.
>
> >
> > Implementation note: I'm checking IS_ENABLED(CONFIG_DEBUG_VM) because,
> > unlike the checks above, this one would otherwise be hard to write such
> > that it completely compiles away in non-debug builds by itself, without
> > looking extremely ugly.
>
> David already addressed.
>
> >
> > [1] https://lore.kernel.org/r/67abaeaf.050a0220.110943.0041.GAE@google.com
> > [2] https://lore.kernel.org/r/67a76f33.050a0220.3d72c.0028.GAE@google.com
> >
> > Signed-off-by: Jann Horn <jannh@...gle.com>
>
> Nit below, and pending David's requests, but LGTM so:
>
> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Thanks!
> > ---
> > include/linux/rmap.h | 13 +++++++++++++
> > 1 file changed, 13 insertions(+)
> >
> > diff --git a/include/linux/rmap.h b/include/linux/rmap.h
> > index c4f4903b1088..ba694c436f59 100644
> > --- a/include/linux/rmap.h
> > +++ b/include/linux/rmap.h
> > @@ -449,6 +449,19 @@ static inline void __folio_rmap_sanity_checks(const struct folio *folio,
> > default:
> > VM_WARN_ON_ONCE(true);
> > }
> > +
> > + /*
> > + * Anon folios must have an associated live anon_vma as long as they're
> > + * mapped into userspace.
> > + * Part of the purpose of the atomic_read() is to make KASAN check that
> > + * the anon_vma is still alive.
> > + */
> > + if (IS_ENABLED(CONFIG_DEBUG_VM) && PageAnonNotKsm(page)) {
> > + unsigned long mapping = (unsigned long)folio->mapping;
> > + struct anon_vma *anon_vma = (void *)(mapping - PAGE_MAPPING_ANON);
> > +
> > + VM_WARN_ON_FOLIO(atomic_read(&anon_vma->refcount) == 0, folio);
>
> Maybe stupid question, but wouldn't we pick this up with KASAN? Or would we
> pick it up far too late I guess?
I mean, it depends on the exact nature of the issue. If the issue is
that we somehow end up with anon folios mapped into VMAs that are not
associated with the anon folio's anon_vma, then I think the only time
we'd notice would be if we randomly end up doing rmap walks of the
folios, as syzkaller did above.
> We're sort of relying on this
>
> a. being a UAF
>
> b. the thing we're UAF-ing not either corrupting this field or (if that
> memory is actually reused as an anon_vma - I'm not familiar with slab
> caches - so maybe it's quite likely - getting its refcount incremented.
KASAN sees the memory read I'm doing with this atomic_read(), so in
KASAN builds, if this is a UAF, it should trigger a KASAN splat
(modulo KASAN limitations around when UAF can be detected). Basically,
in KASAN builds, the actual explicit check I'm doing here is only
relevant if the object has not yet been freed. That's why I wrote the
comment "Part of the purpose of the atomic_read() is to make KASAN
check that the anon_vma is still alive.".
> Which is fine - because hey this is the only way we can get a hint that
> this is happening, but I think we should describe what we're doing.
Sure, I can make the comment more verbose.
Powered by blists - more mailing lists