linux-kernel - Re: [ANNOUNCE] PUCK Agenda - 2024.08.07 - KVM userfault (guest

Message-ID: <CADrL8HXVNcbcuu9qF3wtkccpW6_QEnXQ1ViWEceeS9QGdQUTiw@mail.gmail.com>
Date: Wed, 7 Aug 2024 10:21:53 -0700
From: James Houghton <jthoughton@...gle.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Peter Xu <peterx@...hat.com>, Paolo Bonzini <pbonzini@...hat.com>, 
	Oliver Upton <oliver.upton@...ux.dev>, Axel Rasmussen <axelrasmussen@...gle.com>, 
	David Matlack <dmatlack@...gle.com>, Anish Moorthy <amoorthy@...gle.com>
Subject: Re: [ANNOUNCE] PUCK Agenda - 2024.08.07 - KVM userfault
 (guest_memfd/HugeTLB postcopy)

On Thu, Aug 1, 2024 at 3:44 PM Sean Christopherson <seanjc@...gle.com> wrote:
>
> Early warning for next week's PUCK since there's actually a topic this time.
> James is going to lead a discussion on KVM userfault[*] (name subject to change).

Thanks for attending, everyone!

We seemed to arrive at the following conclusions:

1. For guest_memfd, stage 2 mapping installation will never go through
GUP / virtual addresses to do the GFN --> PFN translation, including
when it supports non-private memory.
2. Something like KVM Userfault is indeed necessary to handle
post-copy for guest_memfd VMs, especially when guest_memfd supports
non-private memory.
3. We should not hook into the overall GFN --> HVA translation; we
should only hook the GFN --> PFN translation steps to figure out
how to create stage 2 mappings. That is, KVM's own accesses to guest
memory should just go through mm/userfaultfd.
4. We don't need the concept of "async userfaults" (making KVM block
when attempting to access userfault memory) in KVM Userfault.

So I need to think more about what exactly the API should look like
for controlling whether a page should exit to userspace before KVM is
allowed to map it into stage 2, and whether this should apply to all
of guest memory or only to guest_memfd.

It sounds like it will most likely be something like a per-VM bitmap
that describes which pages are allowed to be mapped into stage 2,
applying to all memory, not just guest_memfd memory. Even though it is
solving a problem for guest_memfd specifically, it is slightly cleaner
to have it apply to all memory.
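
As a rough illustration of the per-VM bitmap idea, here is a minimal
userspace sketch in C. It is not the proposed API and not kernel code:
every name in it (struct vm_userfault, vm_userfault_mark_present(),
stage2_fault(), the fault_action values) is hypothetical, and the
polarity of the bitmap (bit set means "may be mapped") is an assumption.

```c
/* Hypothetical sketch only: none of these names are real KVM interfaces. */
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint64_t gfn_t;

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* One bit per guest frame; a set bit means "may be mapped into stage 2". */
struct vm_userfault {
	gfn_t nr_gfns;
	unsigned long *mappable;
};

enum fault_action {
	FAULT_MAP_STAGE2,        /* proceed with the GFN --> PFN translation */
	FAULT_EXIT_TO_USERSPACE, /* let userspace provide the page first */
};

/* Allocate an all-zero bitmap: initially no page may be mapped. */
static int vm_userfault_init(struct vm_userfault *uf, gfn_t nr_gfns)
{
	size_t words = (nr_gfns + BITS_PER_WORD - 1) / BITS_PER_WORD;

	uf->mappable = calloc(words ? words : 1, sizeof(unsigned long));
	if (!uf->mappable)
		return -1;
	uf->nr_gfns = nr_gfns;
	return 0;
}

static void vm_userfault_destroy(struct vm_userfault *uf)
{
	free(uf->mappable);
	uf->mappable = NULL;
	uf->nr_gfns = 0;
}

/* Userspace has populated the page, so allow it to be mapped. */
static bool vm_userfault_mark_present(struct vm_userfault *uf, gfn_t gfn)
{
	if (gfn >= uf->nr_gfns)
		return false;
	uf->mappable[gfn / BITS_PER_WORD] |= 1UL << (gfn % BITS_PER_WORD);
	return true;
}

/* Decision taken on a stage 2 fault, before any mapping is installed. */
static enum fault_action stage2_fault(const struct vm_userfault *uf, gfn_t gfn)
{
	bool ok;

	if (gfn >= uf->nr_gfns)
		return FAULT_EXIT_TO_USERSPACE;
	ok = uf->mappable[gfn / BITS_PER_WORD] & (1UL << (gfn % BITS_PER_WORD));
	return ok ? FAULT_MAP_STAGE2 : FAULT_EXIT_TO_USERSPACE;
}
```

In a post-copy flow under these assumptions, a fault on an unmarked GFN
would exit to userspace, userspace would fetch the page contents and
mark the GFN, and the retried fault would then be allowed to map it.
Because the check is keyed by GFN only, it does not care whether the
backing memory is guest_memfd or anything else.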

If this per-VM bitmap applies to all memory, then we don't need to
wait for guest_memfd to support non-private memory before working on a
full implementation. But if not, perhaps it makes sense to wait.

There will be a 30-minute session at LPC to discuss this topic more. I
hope to see you there!

Here are the slides[2].

Thanks!

PS: I'll be away from August 9 - 25.

[2]: https://docs.google.com/presentation/d/1Al9amGumF3ZPX2Wu50mQ4nkPRZZdBJitXmMH3n7j_RE/edit?usp=sharing

> I Cc'd a few folks that I know are interested, please forward this on
> as needed.
>
> Early warning #2, PUCK is canceled for August 14th, as I'll be traveling, though
> y'all are welcome to meet without me.
>
> [*] https://lore.kernel.org/all/20240710234222.2333120-1-jthoughton@google.com
>
> Time:     6am PDT
> Video:    https://meet.google.com/vdb-aeqo-knk
> Phone:    https://tel.meet/vdb-aeqo-knk?pin=3003112178656
>
> Calendar: https://calendar.google.com/calendar/u/0?cid=Y182MWE1YjFmNjQ0NzM5YmY1YmVkN2U1ZWE1ZmMzNjY5Y2UzMmEyNTQ0YzVkYjFjN2M4OTE3MDJjYTUwOTBjN2Q1QGdyb3VwLmNhbGVuZGFyLmdvb2dsZS5jb20
> Drive:    https://drive.google.com/drive/folders/1aTqCrvTsQI9T4qLhhLs_l986SngGlhPH?resourcekey=0-FDy0ykM3RerZedI8R-zj4A&usp=drive_link
>
> Future Schedule:
> August  7th - KVM userfault
> August 14th - Canceled (Sean unavailable)
> August 21st - Available
> August 28th - Available