lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aCFfVHFq_4enEgsL@hyeyoo>
Date: Mon, 12 May 2025 11:39:16 +0900
From: Harry Yoo <harry.yoo@...cle.com>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
        Vlastimil Babka <vbabka@...e.cz>, Jann Horn <jannh@...gle.com>,
        "Liam R . Howlett" <Liam.Howlett@...cle.com>,
        Suren Baghdasaryan <surenb@...gle.com>,
        David Hildenbrand <david@...hat.com>,
        Matthew Wilcox <willy@...radead.org>, Rik van Riel <riel@...riel.com>,
        Wei Yang <richard.weiyang@...il.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/3] mm/vma: fix incorrectly disallowed anonymous VMA
 merges

On Tue, Apr 08, 2025 at 10:29:31AM +0100, Lorenzo Stoakes wrote:
> anon_vma_chain's were introduced by Rik von Riel in commit 5beb49305251 ("mm:
> change anon_vma linking to fix multi-process server scalability issue").
> 
> This patch was introduced in March 2010. As part of this change, careful
> attention was made to the instance of mprotect() causing a VMA merge, with one
> faulted (i.e. having anon_vma set) and another not:
> 
> 		/*
> 		 * Easily overlooked: when mprotect shifts the boundary,
> 		 * make sure the expanding vma has anon_vma set if the
> 		 * shrinking vma had, to cover any anon pages imported.
> 		 */
> 
> In the modern VMA code, this is handled in dup_anon_vma() (and ultimately
> anon_vma_clone()).
> 
> This case is one of the three configurations of adjacent VMA anon_vma state
> that we might encounter on merge (where dst is the VMA which will be merged
> into and src the one being merged into dst):
> 
> 1.  dst->anon_vma,  src->anon_vma - These must be equal, no-op.
> 2.  dst->anon_vma, !src->anon_vma - We simply use dst->anon_vma, no-op.
> 3. !dst->anon_vma,  src->anon_vma - The case in question here.
> 
> In case 3, the instance addressed here - we duplicate the AVC connections
> from src and place into dst.
> 
> However, in practice, we very often do NOT do this.
> 
> This appears to be due to an inadvertent consequence of the change
> introduced by commit 965f55dea0e3 ("mmap: avoid merging cloned VMAs"),
> introduced in May 2011.
> 
> This implies that this merge case was functional only for a little over a
> year, and has since been broken for ~15 years.
> 
> Here, lock scalability concerns lead to us restricting anonymous merges
> only to those VMAs with 1 entry in their vma->anon_vma_chain, that is, a
> VMA that is not connected to any parent process's anon_vma.
> 
> The mergeability test looks like this:
> 
> static inline bool is_mergeable_anon_vma(struct anon_vma *anon_vma1,
> 		 struct anon_vma *anon_vma2, struct vm_area_struct *vma)
> {
> 	if ((!anon_vma1 || !anon_vma2) && (!vma ||
> 		!vma->anon_vma || list_is_singular(&vma->anon_vma_chain)))
> 		return true;
> 	return anon_vma1 == anon_vma2;
> }
> 
> However, we have a problem here - typically the vma passed here is the
> destination VMA.
> 
> For instance in vma_merge_existing_range() we invoke:
> 
> can_vma_merge_left()
> -> [ check that there is an immediately adjacent prior VMA ]
> -> can_vma_merge_after()
>   -> is_mergeable_vma() for general attribute check
> -> is_mergeable_anon_vma([ proposed anon_vma ], prev->anon_vma, prev)
> 
> So if we were considering a target unfaulted 'prev':
> 
> 	  unfaulted    faulted
> 	|-----------|-----------|
> 	|    prev   |    vma    |
> 	|-----------|-----------|
> 
> This would call is_mergeable_anon_vma(NULL, vma->anon_vma, prev).
> 
> The list_is_singular() check for vma->anon_vma_chain, an empty list on
> fault, would cause this merge to _fail_ even though all else indicates a
> merge.
> 
> Equally a simple merge into a next VMA would hit the same problem:
> 
> 	   faulted    unfaulted
> 	|-----------|-----------|
> 	|    vma    |    next   |
> 	|-----------|-----------|
> 
> can_vma_merge_right()
> -> [ check that there is an immediately adjacent succeeding VMA ]
> -> can_vma_merge_before()
>   -> is_mergeable_vma() for general attribute check
> -> is_mergeable_anon_vma([ proposed anon_vma ], next->anon_vma, next)
> 
> For a 3-way merge, we'd also hit the same problem if it was configured like
> this for instance:
> 
> 	  unfaulted    faulted    unfaulted
> 	|-----------|-----------|-----------|
> 	|    prev   |    vma    |    next   |
> 	|-----------|-----------|-----------|
> 
> As we'd call can_vma_merge_left() for prev, and can_vma_merge_right() for
> next, both of which would fail.
> 
> vma_merge_new_range() (and relatedly, vma_expand()) are not impacted, as
> the new VMA would never already be faulted (it is a proposed new range).
> 
> Because we already handle each of the aforementioned merge cases, and can
> absolutely therefore deal with an existing VMA merge with !dst->anon_vma,
> src->anon_vma, there is absolutely no reason to disallow this kind of
> merge.
> 
> It seems that the intention of this patch is to ensure that, in the
> instance of merging unfaulted VMAs with faulted ones, we never wish to do
> so with those with multiple AVCs due to the fact that anon_vma lock's are
> held across both parent and child anon_vma's (actually, the 'root' parent
> anon_vma's lock is used).
> 
> In fact, the original commit alludes to this - "find_mergeable_anon_vma()
> already considers this case".
> 
> In find_mergeable_anon_vma() however, we check the anon_vma which will be
> merged from, if it is set, then we check
> list_is_singular(vma->anon_vma_chain).
> 
> So to match this logic, update is_mergeable_anon_vma() to perform this
> scalability check on the VMA whose anon_vma we ultimately merge into.
> 
> This matches existing behaviour with forked VMAs, only we no longer wrongly
> disallow ALL empty target merges.
> 
> So we both allow merge cases and ensure the scalability check is correctly
> applied.
> 
> We may wish to revisit these lock scalability concerns at a later date and
> ensure they are still valid.
> 
> Additionally, correct userland VMA tests which were mistakenly not
> asserting these cases correctly previously to now correctly assert this,
> and to ensure vmg->anon_vma state is always consistent to account for newly
> introduced asserts.
> 
> Fixes: 965f55dea0e3 ("mmap: avoid merging cloned VMAs")
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
> Reviewed-by: Yeoreum Yun <yeoreum.yun@....com>
> ---

A little bit late to review, but better late than never :)

This patch makes sense and looks good to me,
Reviewed-by: Harry Yoo <harry.yoo@...cle.com>

-- 
Cheers,
Harry / Hyeonggon

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ