linux-kernel - Re: [PATCH] mm: fix copy_vma() error handling for hugetlb mappings

Message-ID: <0b2a5a80-0709-452f-9815-018cc1cd14fb@lucifer.local>
Date: Fri, 23 May 2025 12:19:04 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Ricardo Cañuelo Navarro <rcn@...lia.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
        "Liam R. Howlett" <Liam.Howlett@...cle.com>,
        Vlastimil Babka <vbabka@...e.cz>, Jann Horn <jannh@...gle.com>,
        Pedro Falcato <pfalcato@...e.de>, revest@...gle.com,
        kernel-dev@...lia.com, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, stable@...r.kernel.org,
        Oscar Salvador <osalvador@...e.de>
Subject: Re: [PATCH] mm: fix copy_vma() error handling for hugetlb mappings

On Fri, May 23, 2025 at 12:44:32PM +0200, Ricardo Cañuelo Navarro wrote:
> Hi Lorenzo,
>
> Thanks for the in-depth review! Answers below:
>
> On Fri, May 23 2025 at 11:00:40, Lorenzo Stoakes <lorenzo.stoakes@...cle.com> wrote:
>
> > OK so really it is _only_ when vma_link() fails?
>
> AFAICT yes, since copy_vma() only calls vma_close() if vma_link()
> fails. A failure in any of the other helpers that copy_vma() calls
> before that point is handled by simply freeing the allocated resources.
>
> > Ordinarily 'private syzbot instance' makes me nervous, but you've made your case
> > here logically.
>
> I understand your qualms with that but, although that instance is mostly
> concerned with downstream code, in this case there's nothing unusual, as
> it was able to find the issue in mainline with a common reproducer. The
> closest public report I found was the one I linked in [3], although I
> couldn't reproduce the issue with the repro provided there.

Yeah as I said in this case we're good :)

The issue has really been instances where people are running modified copies
that are giving what appear to be spurious reports.

This is not the case here!

>
>
> > Hm, do we have a Fixes?
>
> I couldn't find a single commit to point to as a "Fixes". The actual commit
> that introduces that vma_close() call there is
> 4080ef1579b2 ("mm: unconditionally close VMAs on error")
> although I wouldn't say that's the culprit. As you said, the problem
> with vma_close() seems to be more involved. If you want me to add that
> one in the "Fixes" tag so we can keep track of the context, let me know,
> that's fine by me.

Yeah, fair enough.

I don't think that commit is the culprit, as it still does essentially the same
logic, it just also updates vma->vm_ops to prevent any risk of double-closing.
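
To illustrate the double-close point, here is a minimal userspace sketch of the
idea (not the kernel code). The struct layouts and the names dummy_ops,
counting_close(), model_vma_close() and close_twice() are hypothetical
stand-ins; the only thing taken from the discussion is that closing swaps the
ops pointer so a second close becomes a no-op.

```c
#include <assert.h>
#include <stddef.h>

struct vma;

/* Hypothetical, heavily reduced stand-in for vm_operations_struct. */
struct vm_ops {
	void (*close)(struct vma *vma);
};

/* Hypothetical, heavily reduced stand-in for vm_area_struct. */
struct vma {
	const struct vm_ops *ops;
	int close_calls;	/* how many times the close hook actually ran */
};

/* Ops table with no close hook, installed once a VMA has been closed. */
const struct vm_ops dummy_ops = { .close = NULL };

void counting_close(struct vma *vma)
{
	vma->close_calls++;
}

const struct vm_ops counting_ops = { .close = counting_close };

/* Model of "close, then swap the ops" so a repeated close does nothing. */
void model_vma_close(struct vma *vma)
{
	if (vma->ops && vma->ops->close) {
		vma->ops->close(vma);
		vma->ops = &dummy_ops;
	}
}

/* Close the same VMA twice and report how often the hook ran. */
int close_twice(void)
{
	struct vma v = { .ops = &counting_ops, .close_calls = 0 };

	model_vma_close(&v);
	assert(v.ops == &dummy_ops);
	model_vma_close(&v);	/* hook is gone, nothing happens */
	return v.close_calls;
}
```

In this model the hook runs exactly once however often the VMA is closed, which
is why that commit changes nothing about what the first close does.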

I suspect this has been a 'long term' bug, but again one that really is unlikely
to be triggered in reality.

So probably no Fixes really makes sense here.

>
> > Why 6.12+? It seems this bug has been around for... a while.
>
> Because in stable versions lower than that (6.6) the code to patch is in
> mm/mmap.c instead, so I'd rather have this one merged first and then
> submit the appropriate backport for 6.6.

You can backport everything manually. Stable side of things won't affect this
being merged upstream.

Also you're (probably) going to hit merge conflicts anyway in kernels that
predate my refactoring of mremap.c.

So if you feel this should get fixed everywhere, then you can always do # >=
5.4.293 or something and fix things up as you go with manual backports.

But again, I'm not sure this is really worth backporting _at all_ or at least
_that far back_ given how it is more or less impossible to hit in reality.

I think under the kind of memory pressure that would result in this bug (which
I'm not sure can even actually happen, unless a fatal signal arose at the same
time, perhaps), hugetlb reservation miscount would be the absolute least of your
concern.

So my recommendation would really be to avoid a backport here.

>
> > Thanks for the links, though it would be better to provide this information here
> > even if in succinct form. This is because commit messages are a permanent
> > record, and these links (other than lore) are ephemeral.
>
> True but, as you said, it's a bit of a pain to try to fit all the info
> in the commit message, and the repro will still be living somewhere else.

Right, but then we have more information. It's a trade-off obviously.

>
> > So, can we please copy/paste the splat from [1] and drop this link, maybe just
> > keep link [2] as it's not so important (I'm guessing this takes a while to repro
> > so the failure injection hits the right point?) and of course keep [3].
>
> Sure, I'll make the changes for v2. FWIW, in my tests the repro could
> trigger this in a matter of seconds.

Interesting :) I can't repro in qemu at least. I may be misconfiguring this
somehow, however.

>
> > So,
> >
> > Could you implement this slightly differently please? We're duplicating
> > this code now, so I think this should be in its own function with a copious
> > comment.
> >
> > Something like:
> >
> > static void fixup_hugetlb_reservations(struct vm_area_struct *vma)
> > {
> > 	if (is_vm_hugetlb_page(vma))
> > 		clear_vma_resv_huge_pages(vma);
> > }
> >
> > And call this from here and also in copy_vma_and_data().
> >
> > Could you also please update the comment in clear_vma_resv_huge_pages():
> >
> > /*
> >  * Reset and decrement one ref on hugepage private reservation.
> >  * Called with mm->mmap_lock writer semaphore held.
> >  * This function should be only used by move_vma() and operate on
> >  * same sized vma. It should never come here with last ref on the
> >  * reservation.
> >  */
> >
> > Drop the mention of the specific function (which is now wrong, but
> > mentioning _any_ function is asking for bit rot anyway) and replace with
> > something like 'This function should only be used by mremap and...'
>
> Ack, thanks for the suggestions!
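
For anyone following along, a rough userspace model of why the fixup has to
happen before the close on the vma_link() failure path. Everything here is a
hypothetical simplification: struct resv_map and struct vma are toy types,
model_close() is only an assumption about what the hugetlb close hook does to a
shared reservation map, and the model_* and failed_copy() names are made up.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a hugetlb private reservation map (assumption). */
struct resv_map {
	int refs;	/* number of VMAs referencing this map */
	long reserved;	/* pages still reserved on behalf of the owner */
};

/* Toy stand-in for vm_area_struct. */
struct vma {
	int is_hugetlb;
	struct resv_map *resv;
};

/*
 * Assumed close behaviour: a VMA that still points at the map gives the
 * reservations back and drops its reference.
 */
void model_close(struct vma *vma)
{
	if (!vma->resv)
		return;
	vma->resv->reserved = 0;
	vma->resv->refs--;
	vma->resv = NULL;
}

/* Reset the VMA's reservation pointer and drop one ref, never the last. */
void model_clear_resv(struct vma *vma)
{
	assert(vma->resv && vma->resv->refs > 1);
	vma->resv->refs--;
	vma->resv = NULL;
}

/* Same shape as the helper suggested above, on the toy types. */
void model_fixup_hugetlb_reservations(struct vma *vma)
{
	if (vma->is_hugetlb && vma->resv)
		model_clear_resv(vma);
}

/*
 * Duplicate a hugetlb VMA, pretend linking the copy failed, then close the
 * copy. Returns the reservation count left on the map the source still uses.
 */
long failed_copy(int apply_fixup)
{
	struct resv_map map = { .refs = 1, .reserved = 8 };
	struct vma src = { .is_hugetlb = 1, .resv = &map };
	struct vma copy = src;	/* the copy shares the source's map */

	map.refs++;

	if (apply_fixup)
		model_fixup_hugetlb_reservations(&copy);
	model_close(&copy);

	(void)src;
	return map.reserved;
}
```

In this model, closing the copy without the fixup wipes the count that the
source VMA still depends on, while with the fixup the close finds nothing to
release and the source's count is left alone.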

Thanks again for reporting this :)

>
> Cheers,
> Ricardo