Message-ID: <e8abe599-f48f-4203-8c60-9ee776aa4a24@amazon.com>
Date: Mon, 7 Apr 2025 12:04:28 +0100
From: Nikita Kalyazin <kalyazin@...zon.com>
To: "Liam R. Howlett" <Liam.Howlett@...cle.com>, Ackerley Tng
<ackerleytng@...gle.com>, Vishal Annapurve <vannapurve@...gle.com>, "Fuad
Tabba" <tabba@...gle.com>, <akpm@...ux-foundation.org>,
<pbonzini@...hat.com>, <shuah@...nel.org>, <viro@...iv.linux.org.uk>,
<brauner@...nel.org>, <muchun.song@...ux.dev>, <hughd@...gle.com>,
<kvm@...r.kernel.org>, <linux-kselftest@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>,
<linux-fsdevel@...r.kernel.org>, <jack@...e.cz>,
<lorenzo.stoakes@...cle.com>, <jannh@...gle.com>, <ryan.roberts@....com>,
<david@...hat.com>, <jthoughton@...gle.com>, <peterx@...hat.com>,
<graf@...zon.de>, <jgowans@...zon.com>, <roypat@...zon.co.uk>,
<derekmn@...zon.com>, <nsaenz@...zon.es>, <xmarcalx@...zon.com>
Subject: Re: [PATCH v3 0/6] KVM: guest_memfd: support for uffd minor
On 04/04/2025 18:12, Liam R. Howlett wrote:
> +To authors of v7 series referenced in [1]
>
> * Nikita Kalyazin <kalyazin@...zon.com> [250404 11:44]:
>> This series is built on top of the Fuad's v7 "mapping guest_memfd backed
>> memory at the host" [1].
>
> I didn't see their addresses in the to/cc, so I added them to my
> response as I reference the v7 patch set below.

Hi Liam,

Thanks for the feedback and for extending the list.

>
>>
>> With James's KVM userfault [2], it is possible to handle stage-2 faults
>> in guest_memfd in userspace. However, KVM itself also triggers faults
>> in guest_memfd in some cases, for example: PV interfaces like kvmclock,
>> PV EOI and page table walking code when fetching the MMIO instruction on
>> x86. It was agreed in the guest_memfd upstream call on 23 Jan 2025 [3]
>> that KVM would be accessing those pages via userspace page tables.
>
> Thanks for being open about the technical call, but it would be better
> to capture the reasons and not the call date. I explain why in the
> linking section as well.
Thanks for bringing that up. The document mostly records the decision
itself. The main alternative considered previously was to temporarily
reintroduce the pages into the direct map whenever a KVM-internal
access is required. That approach made it significantly more complex to
guarantee correctness in all cases [1]. Since the memslot structure
already contains a guest memory pointer supplied by userspace, KVM
can use it directly when in the VMM or vCPU context. I will add this to
the cover letter in the next version.
[1]
https://lore.kernel.org/kvm/20240709132041.3625501-1-roypat@amazon.co.uk/T/#m4f367c52bbad0f0ba7fb07ca347c7b37258a73e5
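
To illustrate the last point, here is a minimal userspace model of how a
guest frame number can be turned into an address inside the userspace
mapping recorded in a memslot. It is a sketch, not kernel code: the
struct, macro and function names are hypothetical, a 4 KiB page size is
assumed, and the real memslot carries more state than shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. */
#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  (1ULL << MODEL_PAGE_SHIFT)
/* Hypothetical sentinel returned when the gfn is outside the slot. */
#define MODEL_BAD_HVA    UINT64_MAX

/* Hypothetical, simplified stand-in for a memslot. */
struct model_memslot {
	uint64_t base_gfn;       /* first guest frame number in the slot */
	uint64_t npages;         /* number of pages the slot covers */
	uint64_t userspace_addr; /* start of the mapping supplied by userspace */
};

/*
 * Translate a guest frame number into a host virtual address within
 * the userspace mapping, or MODEL_BAD_HVA if the gfn is out of range.
 */
uint64_t model_gfn_to_hva(const struct model_memslot *slot, uint64_t gfn)
{
	assert(slot != NULL);

	if (gfn < slot->base_gfn || gfn - slot->base_gfn >= slot->npages)
		return MODEL_BAD_HVA;

	return slot->userspace_addr +
	       (gfn - slot->base_gfn) * MODEL_PAGE_SIZE;
}
```

An access made through such an address goes via the userspace page
tables, so a missing page faults like any other userspace access
instead of requiring the direct map.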
>
>> In
>> order for such faults to be handled in userspace, guest_memfd needs to
>> support userfaultfd.
>>
>> Changes since v2 [4]:
>> - James: Fix sgp type when calling shmem_get_folio_gfp
>> - James: Improved vm_ops->fault() error handling
>> - James: Add and make use of the can_userfault() VMA operation
>> - James: Add UFFD_FEATURE_MINOR_GUEST_MEMFD feature flag
>> - James: Fix typos and add more checks in the test
>>
>> Nikita
>
> Please slow down...
>
> This patch is at v3, the v7 patch that you are building off has lockdep
> issues [1] reported by one of the authors, and (sorry for sounding harsh
> about the v7 of that patch) the cover letter reads a bit more like an
> RFC than a set ready to go into linux-mm.
AFAIK the lockdep issue was reported on v7 of a different series.
I'm basing my series on [2] ("KVM: Mapping guest_memfd backed memory at
the host for software protected VMs"), while the issue was reported on
[3] ("KVM: Restricted mapping of guest_memfd at the host and arm64
support"), which is also built on top of [2]. Please correct me if I'm
missing something.
The key feature required by my series is the ability to mmap
guest_memfd when the VM type allows it. My understanding is that no one
is opposed to that as of now, which is why I assumed it was safe to
build on top of it.
[2] https://lore.kernel.org/kvm/20250318161823.4005529-1-tabba@google.com/T/
[3]
https://lore.kernel.org/all/diqz1puanquh.fsf@ackerleytng-ctop.c.googlers.com/T/
>
> Maybe the lockdep issue is just a patch ordering thing or removed in a
> later patch set, but that's not mentioned in the discovery email?
>
> What exactly is the goal here and the path forward for the rest of us
> trying to build on this once it's in mm-new/mm-unstable?
>
> Note that mm-unstable is shared with a lot of other people through
> linux-next, and we are really trying to stop breaking stuff on them.
>
> Obviously v7 cannot go in until it works with lockdep - otherwise none
> of us can use lockdep which is not okay.
>
> Also, I am concerned about the amount of testing in the v7 and v3 patch
> sets that did not bring up a lockdep issue..
>
>>
>> [1] https://lore.kernel.org/kvm/20250318161823.4005529-1-tabba@google.com/T/
>> [2] https://lore.kernel.org/kvm/20250109204929.1106563-1-jthoughton@google.com/T/
>> [3] https://docs.google.com/document/d/1M6766BzdY1Lhk7LiR5IqVR8B8mG3cr-cxTxOrAosPOk/edit?tab=t.0#heading=h.w1126rgli5e3
>
> If there is anything we need to know about the decisions in the call and
> that document, can you please pull it into this change log?
>
> I don't think anyone can ensure google will not rename docs to some
> other office theme tomorrow - as they famously ditch basically every
> name and application.
>
> Also, most of the community does not want to go to a 17 page (and
> growing) spreadsheet to hunt down the facts when there is an acceptable
> and ideal place to document them in git. It's another barrier of entry
> on reviewing your code as well.
>
> But please, don't take this suggestion as carte blanche for copying a
> conversation from the doc, just give us the technical reasons for your
> decisions as briefly as possible.
>
>
>> [4] https://lore.kernel.org/kvm/20250402160721.97596-1-kalyazin@amazon.com/T/
>
> [1]. https://lore.kernel.org/all/diqz1puanquh.fsf@ackerleytng-ctop.c.googlers.com/
>
> Thanks,
> Liam