lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <64847aca-bb02-43e0-8951-33f18d6af193@lucifer.local>
Date: Mon, 7 Apr 2025 11:24:19 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Wei Yang <richard.weiyang@...il.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
        Vlastimil Babka <vbabka@...e.cz>, Jann Horn <jannh@...gle.com>,
        "Liam R . Howlett" <Liam.Howlett@...cle.com>,
        Suren Baghdasaryan <surenb@...gle.com>,
        David Hildenbrand <david@...hat.com>,
        Matthew Wilcox <willy@...radead.org>, Rik van Riel <riel@...riel.com>,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/3] mm/vma: fix incorrectly disallowed anonymous VMA
 merges

On Fri, Apr 04, 2025 at 11:32:31PM +0000, Wei Yang wrote:
> On Fri, Apr 04, 2025 at 02:04:10PM +0100, Lorenzo Stoakes wrote:
> >On Fri, Apr 04, 2025 at 12:53:15PM +0000, Wei Yang wrote:
> >> On Mon, Mar 17, 2025 at 09:15:03PM +0000, Lorenzo Stoakes wrote:
> >> [...]
> >> >However, we have a problem here - typically the vma passed here is the
> >> >destination VMA.
> >> >
> >> >For instance in vma_merge_existing_range() we invoke:
> >> >
> >> >can_vma_merge_left()
> >> >-> [ check that there is an immediately adjacent prior VMA ]
> >> >-> can_vma_merge_after()
> >> >  -> is_mergeable_vma() for general attribute check
> >> >-> is_mergeable_anon_vma([ proposed anon_vma ], prev->anon_vma, prev)
> >> >
> >> >So if we were considering a target unfaulted 'prev':
> >> >
> >> >	  unfaulted    faulted
> >> >	|-----------|-----------|
> >> >	|    prev   |    vma    |
> >> >	|-----------|-----------|
> >> >
> >> >This would call is_mergeable_anon_vma(NULL, vma->anon_vma, prev).
> >> >
> >> >The list_is_singular() check for vma->anon_vma_chain, an empty list on
> >> >fault, would cause this merge to _fail_ even though all else indicates a
> >> >merge.
> >> >
> >>
> >> Great spot. It is hiding there for 15 years.
> >
> >Thanks!
> >
> >>
> >> >Equally a simple merge into a next VMA would hit the same problem:
> >> >
> >> >	   faulted    unfaulted
> >> >	|-----------|-----------|
> >> >	|    vma    |    next   |
> >> >	|-----------|-----------|
> >> >
> >> [...]
> >> >---
> >> > mm/vma.c                |  81 +++++++++++++++++++++++---------
> >> > tools/testing/vma/vma.c | 100 +++++++++++++++++++++-------------------
> >> > 2 files changed, 111 insertions(+), 70 deletions(-)
> >> >
> >> >diff --git a/mm/vma.c b/mm/vma.c
> >> >index 5cdc5612bfc1..5418eef3a852 100644
> >> >--- a/mm/vma.c
> >> >+++ b/mm/vma.c
> >> >@@ -57,6 +57,22 @@ struct mmap_state {
> >> > 		.state = VMA_MERGE_START,				\
> >> > 	}
> >> >
> >> >+/*
> >> >+ * If, at any point, the VMA had unCoW'd mappings from parents, it will maintain
> >> >+ * more than one anon_vma_chain connecting it to more than one anon_vma. A merge
> >> >+ * would mean a wider range of folios sharing the root anon_vma lock, and thus
> >> >+ * potential lock contention, we do not wish to encourage merging such that this
> >> >+ * scales to a problem.
> >> >+ */
> >>
> >> I don't follow here. Take a look into do_wp_page(), where CoW happens. But I
> >> don't find where it will unlink parent anon_vma from vma->anon_vma_chain.
> >
> >Look at anon_vma_clone() in fork case. It's not the point of CoW that's the
> >issue, it's propagation of AVC's upon fork.
> >
> >>
> >> Per my understanding, the unlink behavior happens in unlink_anon_vma() which
> >> unlink all anon_vma on vma->anon_vma_chain. And the normal caller of
> >> unlink_anon_vma() is free_pgtables(). Other callers are on error path to
> >> release prepared data. From this perspective, I don't see the chance to unlink
> >> parent anon_vma from vma->anon_vma_chain either.
> >>
> >> But maybe I missed something. If it is not too bother, would you mind giving
> >> me a hint?
> >
> >What you're saying is 'we never go back and fix this up once unCoW'd' which is
> >true, but I don't want to write several page-length essays in comments, and this
> >is a sensible way of looking at things for the purposes of this check.
> >
> >In future, we may want to actually do something like this, if it makes sense.
> >
>
> Ok, this is the future plan instead of current behavior.

No, it also describes the current situation, there just isn't what you've
decided 'should' happen or is implied by this naming later.

I find this to be rather pedantic, honestly. If you're trying to base your
understanding of an incredibly subtle and difficult part of mm on function
names, then you are making a mistake.

The name describes reality closely enough.

>
> My personal feeling is it would misleading to readers. I would think if all
> pages mapping in VMA is Cow'd, the vma->anon_vma_chain becomes singular in
> current kernel.

Yup, you've said this already... :)

>
> A page-length comment is not we want, how about "maybe_root_anon_vma"? When
> vma->anon_vma_chain is empty or singular, it means the (future) vma->anon_vma
> is the root anon_vma.

No, that's a horrible name and your explanation is extremely
confusing. This is already a very confusing area of the kernel (one I am
really trying to improve upon, FWIW), and I don't think this change would
serve to improve things here.

Since you seem dead-set on this issue, I think the best solution here is
just for me to update the comment a little. I'll send a fix-patch.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ