lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1a43ebfa-0c4f-4029-ad81-125aba68b764@suse.cz>
Date: Fri, 25 Jul 2025 17:38:03 +0200
From: Vlastimil Babka <vbabka@...e.cz>
To: Jann Horn <jannh@...gle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
 David Hildenbrand <david@...hat.com>,
 Lorenzo Stoakes <lorenzo.stoakes@...cle.com>, Rik van Riel
 <riel@...riel.com>, "Liam R. Howlett" <Liam.Howlett@...cle.com>,
 Harry Yoo <harry.yoo@...cle.com>, linux-mm@...ck.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] mm/rmap: Add anon_vma lifetime debug check

On 7/25/25 16:44, Jann Horn wrote:
> On Fri, Jul 25, 2025 at 4:11 PM Vlastimil Babka <vbabka@...e.cz> wrote:
>> On 7/25/25 14:16, Jann Horn wrote:
>> > If an anon folio is mapped into userspace, its anon_vma must be alive,
>> > otherwise rmap walks can hit UAF.
>> >
>> > There have been syzkaller reports a few months ago[1][2] of UAF in rmap
>> > walks that seems to indicate that there can be pages with elevated mapcount
>> > whose anon_vma has already been freed, but I think we never figured out
>> > what the cause is; and syzkaller only hit these UAFs when memory pressure
>> > randomly caused reclaim to rmap-walk the affected pages, so it of course
>> > didn't manage to create a reproducer.
>> >
>> > Add a VM_WARN_ON_FOLIO() when we add/remove mappings of anonymous folios to
>> > hopefully catch such issues more reliably.
>> >
>> > [1] https://lore.kernel.org/r/67abaeaf.050a0220.110943.0041.GAE@google.com
>> > [2] https://lore.kernel.org/r/67a76f33.050a0220.3d72c.0028.GAE@google.com
>> >
>> > Acked-by: David Hildenbrand <david@...hat.com>
>> > Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
>> > Signed-off-by: Jann Horn <jannh@...gle.com>

Acked-by: Vlastimil Babka <vbabka@...e.cz>

>> > ---
>> > Changes in v2:
>> > - applied akpm's fixup (use FOLIO_MAPPING_ANON, ...)
>> > - remove CONFIG_DEBUG_VM check and use folio_test_* helpers (David)
>> > - more verbose comment (Lorenzo)
>> > - replaced "page" mentions with "folio" in commit message
>> > - Link to v1: https://lore.kernel.org/r/20250724-anonvma-uaf-debug-v1-1-29989ddc4e2a@google.com
>> > ---
>> >  include/linux/rmap.h | 22 ++++++++++++++++++++++
>> >  1 file changed, 22 insertions(+)
>> >
>> > diff --git a/include/linux/rmap.h b/include/linux/rmap.h
>> > index 20803fcb49a7..6cd020eea37a 100644
>> > --- a/include/linux/rmap.h
>> > +++ b/include/linux/rmap.h
>> > @@ -449,6 +449,28 @@ static inline void __folio_rmap_sanity_checks(const struct folio *folio,
>> >       default:
>> >               VM_WARN_ON_ONCE(true);
>> >       }
>> > +
>> > +     /*
>> > +      * Anon folios must have an associated live anon_vma as long as they're
>> > +      * mapped into userspace.
>> > +      * Note that the atomic_read() mainly does two things:
>> > +      *
>> > +      * 1. In KASAN builds with CONFIG_SLUB_RCU_DEBUG, it causes KASAN to
>> > +      *    check that the associated anon_vma has not yet been freed (subject
>>
>> I think more precisely it checks that the slab folio hosting the anon_vma
>> could not have been yet freed, IIUC? If the anon_vma itself has been freed
>> then this will not trigger.
> 
> The point of CONFIG_SLUB_RCU_DEBUG, which I'm talking about here, is
> that it allows KASAN to catch UAF once the anon_vma has been freed and
> an RCU grace period has passed; it is not necessary that the slab
> folio has been freed.
> 
> You can see that working in the linked syzkaller reports - KASAN
> tracked the object as freed after slab_free_after_rcu_debug(), which
> is an RCU callback scheduled from kmem_cache_free().
> 
>> > +      *    to KASAN's usual limitations). This check will pass if the
>> > +      *    anon_vma's refcount has already dropped to 0 but an RCU grace
>> > +      *    period hasn't passed since then.
>>
>> AFAIU this says it more accurately and matches my interpretation above?
>>
>> > +      * 2. If the anon_vma has not yet been freed, it checks that the
>> > +      *    anon_vma still has a nonzero refcount (as opposed to being in the
>> > +      *    middle of an RCU delay for getting freed).
>>
>> Again the RCU delay would apply to the slab page, unless you talk about the
>> CONFIG_SLUB_RCU_DEBUG specific path (IIRC).
> 
> Yes, right, the "RCU delay" in the second bullet point refers to
> CONFIG_SLUB_RCU_DEBUG.

OK I misunderstood that while bullet 1 notes the check only happens with
CONFIG_SLUB_RCU_DEBUG, I assumed the description is still meant semantically
from the point of anon_vma users (particularly what "freed" means - moment
of kfree() vs KASAN quarantine). Once considered from the point of what
happens with the object under CONFIG_SLUB_RCU_DEBUG, it all makes sense.

> Here I'm saying "If the anon_vma has not yet been freed" because
> that's the only case in which I can reliably say what will happen, and
> this is the main case that isn't already covered by the first bullet
> point in a CONFIG_SLUB_RCU_DEBUG build.
> 
>> That said, I wonder if here in __folio_rmap_sanity_checks() we are even in a
>> situation where we rely on SLAB_TYPESAFE_BY_RCU in order to not touch
>> something that's not anon_vma anymore? I think we expect it to exist?
> 
> Yes, we expect it to exist. That's why I'm not just asserting that the
> anon_vma is still considered live by KASAN, but also that its refcount
> is non-zero.
> 
>> Can we
>> thus invent a CONFIG_SLUB_RCU_DEBUG-specific assert that assert the anon_vma
>> itself has not been freed yet (i.e. even if within a grace period?).
> 
> That is essentially what I'm doing - checking that the count is
> nonzero verifies that it's not within a grace period, and the implicit
> KASAN check verifies it can't be in a KASAN quarantine after the grace
> period is over.

OK I guess that's sufficient and we'd be unlikely to find a bug scenario
where anon_vma was kfree'd() and thus KASAN with CONFIG_SLUB_RCU_DEBUG is
waiting for the grace period, yet it doesn't have a zero refcount.


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ