linux-kernel - Re: [PATCH] mm: fix copy_vma() error handling for hugetlb mappings

Message-ID: <87iklrbo8f.fsf@igalia.com>
Date: Fri, 23 May 2025 12:44:32 +0200
From: Ricardo Cañuelo Navarro <rcn@...lia.com>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, "Liam R. Howlett"
 <Liam.Howlett@...cle.com>, Vlastimil Babka <vbabka@...e.cz>, Jann Horn
 <jannh@...gle.com>, Pedro Falcato <pfalcato@...e.de>, revest@...gle.com,
 kernel-dev@...lia.com, linux-mm@...ck.org, linux-kernel@...r.kernel.org,
 stable@...r.kernel.org, Oscar Salvador <osalvador@...e.de>
Subject: Re: [PATCH] mm: fix copy_vma() error handling for hugetlb mappings

Hi Lorenzo,

Thanks for the in-depth review! Answers below:

On Fri, May 23 2025 at 11:00:40, Lorenzo Stoakes <lorenzo.stoakes@...cle.com> wrote:

> OK so really it is _only_ when vma_link() fails?

AFAICT yes, since copy_vma() only calls vma_close() if vma_link()
fails. A failure in any of the other helpers called earlier in
copy_vma() is handled by simply freeing the allocated resources.
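
To illustrate the distinction, here is a minimal user-space sketch, not
kernel code. Every mock_* name, the fail_point enum and the apply_fix
flag are hypothetical stand-ins invented for the example. It also
assumes that the problem is vma_close() running while the copied VMA
still carries the reservation it inherited from the original, which is
my reading of the bug rather than something the sketch proves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA; NOT the kernel's struct vm_area_struct. */
struct mock_vma {
	bool is_hugetlb;            /* models is_vm_hugetlb_page() */
	bool has_resv;              /* models the inherited hugetlb reservation */
	int close_calls;            /* how many times the close hook ran */
	bool resv_touched_on_close; /* close ran with the reservation attached */
};

/* Where the modelled copy_vma() is made to fail. */
enum fail_point { FAIL_NONE, FAIL_EARLY_HELPER, FAIL_VMA_LINK };

/* Models the suggested helper: drop the reservation on hugetlb VMAs only. */
static void mock_fixup_hugetlb_reservations(struct mock_vma *vma)
{
	assert(vma != NULL);
	if (vma->is_hugetlb)
		vma->has_resv = false;
}

/* Models vma_close(): records whether it saw a live reservation (assumed bug). */
static void mock_vma_close(struct mock_vma *vma)
{
	assert(vma != NULL);
	vma->close_calls++;
	if (vma->is_hugetlb && vma->has_resv)
		vma->resv_touched_on_close = true;
}

/* Models the error paths of copy_vma(); returns true on success. */
static bool mock_copy_vma(struct mock_vma *new_vma, enum fail_point fail,
			  bool apply_fix)
{
	if (fail == FAIL_EARLY_HELPER)
		return false; /* resources are just freed, no close call */

	if (fail == FAIL_VMA_LINK) {
		if (apply_fix)
			mock_fixup_hugetlb_reservations(new_vma);
		mock_vma_close(new_vma); /* the only path that closes */
		return false;
	}

	return true;
}
```

In this model an early failure never reaches the close hook, while a
vma_link() failure reaches it exactly once, and only the apply_fix
variant gets there with the reservation already cleared.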

> Ordinarily 'private syzbot instance' makes me nervous, but you've made your case
> here logically.

I understand your qualms about that but, although that instance is mostly
concerned with downstream code, in this case there's nothing unusual, as
it was able to find the issue in mainline with a common reproducer. The
closest public report I found was the one I linked in [3], although I
couldn't reproduce the issue with the repro provided there.

> Hm, do we have a Fixes?

I couldn't find a single commit to point to as a "Fixes". The actual commit
that introduced that vma_close() call there is
4080ef1579b2 ("mm: unconditionally close VMAs on error")
although I wouldn't say that's the culprit. As you said, the problem
with vma_close() seems to be more involved. If you want me to add that
one in the "Fixes" tag so we can keep track of the context, let me know,
that's fine by me.

> Why 6.12+? It seems this bug has been around for... a while.

Because in stable versions older than that (6.6) the code to patch is in
mm/mmap.c instead, so I'd rather have this one merged first and then
submit the appropriate backport for 6.6.

> Thanks for links, though it's better to please provide this information here
> even if in succinct form. This is because commit messages are a permanent
> record, and these links (other than lore) are ephemeral.

True but, as you said, it's a bit of a pain to try to fit all the info
in the commit message, and the repro will still be living somewhere else.

> So, can we please copy/paste the splat from [1] and drop this link, maybe just
> keep link [2] as it's not so important (I'm guessing this takes a while to repro
> so the failure injection hits the right point?) and of course keep [3].

Sure, I'll make the changes for v2. FWIW, in my tests the repro could
trigger this in a matter of seconds.

> So,
>
> Could you implement this slightly differently please? We're duplicating
> this code now, so I think this should be in its own function with a copious
> comment.
>
> Something like:
>
> static void fixup_hugetlb_reservations(struct vm_area_struct *vma)
> {
> 	if (is_vm_hugetlb_page(vma))
> 		clear_vma_resv_huge_pages(vma);
> }
>
> And call this from here and also in copy_vma_and_data().
>
> Could you also please update the comment in clear_vma_resv_huge_pages():
>
> /*
>  * Reset and decrement one ref on hugepage private reservation.
>  * Called with mm->mmap_lock writer semaphore held.
>  * This function should be only used by move_vma() and operate on
>  * same sized vma. It should never come here with last ref on the
>  * reservation.
>  */
>
> Drop the mention of the specific function (which is now wrong, but
> mentioning _any_ function is asking for bit rot anyway) and replace with
> something like 'This function should only be used by mremap and...'

Ack, thanks for the suggestions!

Cheers,
Ricardo