Message-ID: <55b727fc-8fd3-4e03-8143-1ed6dcab2781@redhat.com>
Date: Fri, 12 Sep 2025 17:39:08 +0200
From: David Hildenbrand <david@...hat.com>
To: kalyazin@...zon.com, James Houghton <jthoughton@...gle.com>,
"Kalyazin, Nikita" <kalyazin@...zon.co.uk>
Cc: "pbonzini@...hat.com" <pbonzini@...hat.com>,
"shuah@...nel.org" <shuah@...nel.org>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"linux-kselftest@...r.kernel.org" <linux-kselftest@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"michael.day@....com" <michael.day@....com>,
"Roy, Patrick" <roypat@...zon.co.uk>, "Thomson, Jack"
<jackabt@...zon.co.uk>, "Manwaring, Derek" <derekmn@...zon.com>,
"Cali, Marco" <xmarcalx@...zon.co.uk>
Subject: Re: [PATCH v5 1/2] KVM: guest_memfd: add generic population via write
On 12.09.25 16:48, Nikita Kalyazin wrote:
>
>
> On 12/09/2025 14:36, David Hildenbrand wrote:
>> On 11.09.25 12:15, Nikita Kalyazin wrote:
>>>
>>>
>>> On 10/09/2025 22:23, James Houghton wrote:
>>>> On Tue, Sep 2, 2025 at 4:20 AM Kalyazin, Nikita
>>>> <kalyazin@...zon.co.uk> wrote:
>>>>>
>>>>> From: Nikita Kalyazin <kalyazin@...zon.com>
>>>>
>>>> Hi Nikita,
>>>
>>> Hi James,
>>>
>>> Thanks for the review!
>>>
>>>
>>>>>
>>>>> write syscall populates guest_memfd with user-supplied data in a
>>>>> generic way, i.e. no vendor-specific preparation is performed. This is
>>>>> supposed to be used in non-CoCo setups where guest memory is not
>>>>> hardware-encrypted.
>>>>
>>>> What's meant to happen if we do use this for CoCo VMs? I would expect
>>>> write() to fail, but I don't see why it would (seems like we need/want
>>>> a check that we aren't write()ing to private memory).
>>>
>>> I am not so sure that write() should fail even in CoCo VMs if we access
>>> not-yet-prepared pages. My understanding was that the CoCoisation of
>>> the memory occurs during "preparation". But I may be wrong here.
>>
>> But how do you handle that a page is actually inaccessible and should
>> not be touched?
>>
>> IOW, with CXL you could crash the host.
>>
>> There is likely some state check missing, or it should be restricted to
>> VM types.
>
> Sorry, I'm missing the link between VM types and CXL. How are they related?
I think what you explain below clarifies it.
>
> My thinking was it is a regular (accessible) page until it is "prepared"
> by the CoCo hardware, which is currently tracked by the up-to-date flag,
> so it is safe to assume that until it is "prepared", it is accessible
> because it was allocated by filemap_grab_folio() ->
> filemap_alloc_folio() and hasn't been taken over by the CoCo hardware.
> What scenario can you see where it doesn't apply as of now?
Thanks for clarifying, see below.
>
> I am aware of an attempt to remove preparation tracking from
> guest_memfd, but it is still at an RFC stage AFAIK [1].
>
>>
>> Do we know how this would interact with the direct-map removal?
>
> I'm using folio_test_uptodate() to determine if the page has been
> removed from the direct map as kvm_gmem_mark_prepared() is what
> currently removes the page from the direct map and marks it as
> up-to-date. [2] is a Firecracker feature branch where the two work in
> combination.
Ah, okay. Yes, I recall from [1] that we wanted to change these semantics
to "uptodate: was zeroed", with preparation then handled entirely by the
arch backend.
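
To make the state check discussed in this thread concrete, here is a minimal
userspace sketch of the current semantics as described above: write() may
populate a page only while it is not yet marked up-to-date ("prepared").
This is a toy model, not kernel code. struct toy_folio, toy_gmem_write(),
toy_gmem_mark_prepared() and the choice of -EEXIST are all hypothetical
names and values picked for illustration; only the idea of gating on the
up-to-date flag comes from the discussion.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

#define TOY_PAGE_SIZE 4096

/* Toy stand-in for a guest_memfd folio; NOT the kernel's struct folio. */
struct toy_folio {
	bool uptodate;	/* models what folio_test_uptodate() reports */
	unsigned char data[TOY_PAGE_SIZE];
};

/* Models "preparation": after this the page must no longer be touched. */
static void toy_gmem_mark_prepared(struct toy_folio *f)
{
	assert(f != NULL);
	f->uptodate = true;
}

/*
 * Hypothetical write path: copy user data into a page only if the page
 * has not been prepared yet. The errno for the refused case (-EEXIST)
 * is an arbitrary choice for this sketch.
 */
static ssize_t toy_gmem_write(struct toy_folio *f, size_t off,
			      const void *buf, size_t len)
{
	assert(f != NULL && buf != NULL);

	if (off >= TOY_PAGE_SIZE)
		return -EINVAL;
	if (f->uptodate)
		return -EEXIST;	/* already prepared: refuse to touch it */

	/* Clamp to the end of the page; a short write is reported back. */
	if (len > TOY_PAGE_SIZE - off)
		len = TOY_PAGE_SIZE - off;

	memcpy(f->data + off, buf, len);
	return (ssize_t)len;
}
```

In this model a write to a fresh page succeeds, and the same write after
toy_gmem_mark_prepared() is refused. If the semantics change to
"uptodate: was zeroed", the flag tested here would no longer be the right
gate, and the check would have to come from the arch backend instead.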
--
Cheers
David / dhildenb