Message-ID: <CAEvNRgHX7MPSBX7pMeSWEtzc0-bJhAZ=pv+WF0VtOv9Tx0Jpxw@mail.gmail.com>
Date: Wed, 4 Feb 2026 15:17:05 -0800
From: Ackerley Tng <ackerleytng@...gle.com>
To: "David Hildenbrand (arm)" <david@...nel.org>, Sean Christopherson <seanjc@...gle.com>
Cc: syzbot+33a04338019ac7e43a44@...kaller.appspotmail.com, kvm@...r.kernel.org, 
	linux-kernel@...r.kernel.org, pbonzini@...hat.com, 
	syzkaller-bugs@...glegroups.com, michael.roth@....com, vannapurve@...gle.com, 
	kartikey406@...il.com
Subject: Re: [PATCH] KVM: guest_memfd: Disable VMA merging with VM_DONTEXPAND

"David Hildenbrand (arm)" <david@...nel.org> writes:

> On 2/4/26 22:37, Sean Christopherson wrote:
>> On Wed, Feb 04, 2026, Ackerley Tng wrote:
>>> Ackerley Tng <ackerleytng@...gle.com> writes:
>>>
>>>> #syz test: git://git.kernel.org/pub/scm/virt/kvm/kvm.git next
>>>>
>>>> guest_memfd VMAs don't need to be merged,
>>
>> Why not?  There are benefits to merging VMAs that have nothing to do with folios.
>> E.g. map 1GiB of guest_memfd with 512*512 4KiB VMAs, and then it becomes quite
>> desirable to merge all of those VMAs into one.
>>

I didn't realise that VM_DONTEXPAND's no-expansion policy also covers
the case where adjacent VMAs with the same flags, etc., would otherwise
merge automatically. Since VM_DONTEXPAND blocks that kind of expansion
too, I agree VM_DONTEXPAND is not a great fit.
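
For reference, this is the check I had missed: VM_DONTEXPAND is part of
VM_SPECIAL, and the merge path refuses "special" VMAs outright. Roughly
(the #define is verbatim from include/linux/mm.h; the helper is my
simplified paraphrase of the bail-out in vma_merge(), not the exact
code):

	/* include/linux/mm.h */
	#define VM_SPECIAL (VM_IO | VM_DONTEXPAND | VM_PFNMAP | VM_MIXEDMAP)

	/* vma_merge(), paraphrased: merging gives up on any "special"
	 * VMA, so with VM_DONTEXPAND set, adjacent guest_memfd VMAs
	 * would never merge even with identical flags. */
	static bool vma_flags_mergeable(vm_flags_t vm_flags)
	{
		return !(vm_flags & VM_SPECIAL);
	}

So a guest mapped as many small mmap() chunks would be stuck with one
VMA per chunk, which is exactly the 512*512-VMA case above.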

>> Creating _hugepages_ doesn't add value, but that's not the same thing as merging
>> VMAs.
>>
>>>> especially now, since guest_memfd only supports PAGE_SIZE folios.
>>>>
>>>> Set VM_DONTEXPAND on guest_memfd VMAs.
>>>
>>> Local tests and syzbot agree that this fixes the issue identified. :)
>>>
>>> I would like to look into madvise(MADV_COLLAPSE) and uprobes triggering
>>> mapping/folio collapsing before submitting a full patch series.
>>>
>>> David, Michael, Vishal, what do you think of the choice of setting
>>> VM_DONTEXPAND to disable khugepaged?
>>
>> I'm not one of the above, but for me it feels very much like treating a symptom

I was going to find a solution before getting back to you, to save you
some time :)

>> and not fixing the underlying cause.
>
> And you are spot-on :)
>
>>
>> It seems like what KVM should do is not block one path that triggers hugepage
>> processing, but instead flat out disallow creating hugepages.  Unfortunately,

__filemap_get_folio_mpol(), which we use in kvm_gmem_get_folio(), looks
up mapping_min_folio_order() to determine what order of folio to
allocate. I think we could lock that down to always use order 0. I
tried that here [1], but it doesn't cover this case: khugepaged
allocates new folios for guest_memfd (and other mappings) directly in
collapse_file(), explicitly specifying PMD_ORDER.

I took a look but wasn't able to find a central callback/ops hook that
catches all filesystem folio allocations.

[1] https://lore.kernel.org/all/6982553e.a00a0220.34fa92.0009.GAE@google.com/
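
Concretely, the change in [1] amounts to pinning the folio order range
at inode setup time. Sketch only (mapping_set_folio_order_range() is
the existing pagemap helper; placing the call in guest_memfd inode
setup is my paraphrase, untested as written here):

	/* virt/kvm/guest_memfd.c, at inode creation: force order 0 so
	 * __filemap_get_folio() can never allocate a large folio for
	 * this mapping. */
	mapping_set_folio_order_range(inode->i_mapping, 0, 0);

That covers the filemap allocation path, but khugepaged doesn't go
through it: collapse_file() allocates its own PMD-order folio and
installs it in the mapping directly, so the order range is never
consulted.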

>> AFAICT, there's no existing way to prevent madvise() from clearing VM_NOHUGEPAGE,
>> so we can't simply force that flag.
>>
>> I'd prefer not to special case guest_memfd, a la devdax, but I also want to address
>> this head-on, not by removing a tangentially related trigger.
>
> VM_NOHUGEPAGE also smells like the wrong thing. This is a file limitation.
>
> !thp_vma_allowable_order() must take care of that somehow down in
> __thp_vma_allowable_orders(), by checking the file.
>
> Likely the file_thp_enabled() check is the culprit with
> CONFIG_READ_ONLY_THP_FOR_FS?
>
> Maybe we need a flag to say "even not CONFIG_READ_ONLY_THP_FOR_FS".
>
> I wonder how we handle that for secretmem. Too late for me, going to bed :)
>

Let me look deeper into this. Thanks!
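
On the secretmem question: from what I remember, secretmem doesn't rely
on VMA flags either; it special-cases itself in the THP eligibility
path via vma_is_secretmem(). Paraphrasing from memory (the check lived
in hugepage_vma_check(); I'd have to confirm where it ended up after
the thp_vma_allowable_orders() rework):

	/* THP eligibility, paraphrased: secretmem VMAs are excluded
	 * outright, independent of VM_NOHUGEPAGE. */
	if (vma_is_secretmem(vma))
		return false;

That is per-owner special casing, a la devdax, which Sean would prefer
to avoid for guest_memfd, so a generic "this file doesn't want
hugepages" signal still seems worth looking for.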

> --
> Cheers,
>
> David
