Message-ID: <5b8fdef2-c605-475f-9673-f42328db7128@lucifer.local>
Date: Sat, 3 May 2025 18:50:19 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Wei Yang <richard.weiyang@...il.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Vlastimil Babka <vbabka@...e.cz>, Jann Horn <jannh@...gle.com>,
"Liam R . Howlett" <Liam.Howlett@...cle.com>,
Suren Baghdasaryan <surenb@...gle.com>,
Matthew Wilcox <willy@...radead.org>,
David Hildenbrand <david@...hat.com>, Pedro Falcato <pfalcato@...e.de>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH v2 01/10] mm/mremap: introduce more mergeable mremap
via MREMAP_RELOCATE_ANON
On Sat, May 03, 2025 at 03:29:08PM +0100, Lorenzo Stoakes wrote:
> OK have dug into this some more with a drgn script to read actual kernel
> metadata state and it's simpler than I thought - the root anon_vma is
> self-childed, but descendent anon_vma's are not.
>
> We can correct this with an anon_vma->root == anon_vma check. I believe
> we're probably safe with anon_vma reuse, because in that instance the
> anon_vma would not be mapped a shared folio.
>
> However, to be safe, I will check this and, as I said previously, I will
> add a number of tests explicitly testing forking scenarios.
>
> The respin should have this fully addressed.
>
> Thanks, Lorenzo
Note that in practice, this wouldn't have broken anything, as in this case you
would _have_ to have parent anon_vma's.
The root will also hang around even if all of its VMAs are unmapped: we only
tear down an anon_vma once no references to it remain, and by nature everything
below the root must reference it.
But the function is misleading as-is, so it needs fixing.
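To make the self-parenting point concrete, here is a minimal userspace sketch.
It is only a toy model, not the code from the patch: struct toy_anon_vma and
the toy_*() helpers are hypothetical names, and the model assumes that a root
counts itself once in num_children while descendants do not, as described
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; this is NOT the kernel's struct anon_vma. */
struct toy_anon_vma {
	struct toy_anon_vma *root;
	struct toy_anon_vma *parent;
	unsigned long num_children;
};

/* A root is its own root and its own parent, so it counts itself once. */
static void toy_init_root(struct toy_anon_vma *av)
{
	av->root = av;
	av->parent = av;
	av->num_children = 1;	/* the self-parent link */
}

/* A descendant shares the parent's root and does not count itself. */
static void toy_init_child(struct toy_anon_vma *av,
			   struct toy_anon_vma *parent)
{
	assert(parent != NULL);
	av->root = parent->root;
	av->parent = parent;
	av->num_children = 0;
	parent->num_children++;
}

/* The anon_vma->root == anon_vma check from the quoted text. */
static bool toy_is_root(const struct toy_anon_vma *av)
{
	return av->root == av;
}

/* Children other than the anon_vma itself: discount the root's self-link. */
static unsigned long toy_real_children(const struct toy_anon_vma *av)
{
	return toy_is_root(av) ? av->num_children - 1 : av->num_children;
}
```

In this model a root with a single forked child has num_children == 2 but only
one real child, whereas a non-root anon_vma's num_children can be used
directly.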
As for anon_vma reuse: this is not permitted for root anon_vma's, so it
naturally requires a parented anon_vma, which again implies AVCs that the
uncowed parent check would pick up.
Additionally, the reuse implies that the folio is not mapped within the process
that first created the anon_vma, and thus the num_children counts remain correct
across the board.
See https://pastebin.com/raw/q6wzUMLi for a detailed diagram of both scenarios
with anon_vma parameters and linking derived from real-world kernel values
obtained via drgn.
These will form part of the added tests.
Cheers, Lorenzo