Message-ID: <b8589fbd-733d-42ae-a6a7-8683c77a4817@amazon.com>
Date: Tue, 26 Nov 2024 16:04:52 +0000
From: Nikita Kalyazin <kalyazin@...zon.com>
To: David Hildenbrand <david@...hat.com>, <pbonzini@...hat.com>,
<corbet@....net>, <kvm@...r.kernel.org>, <linux-doc@...r.kernel.org>,
<linux-kernel@...r.kernel.org>
CC: <jthoughton@...gle.com>, <brijesh.singh@....com>, <michael.roth@....com>,
<graf@...zon.de>, <jgowans@...zon.com>, <roypat@...zon.co.uk>,
<derekmn@...zon.com>, <nsaenz@...zon.es>, <xmarcalx@...zon.com>, "Sean
Christopherson" <seanjc@...gle.com>, <linux-mm@...ck.org>
Subject: Re: [RFC PATCH 0/4] KVM: ioctl for populating guest_memfd
On 21/11/2024 16:46, Nikita Kalyazin wrote:
>
>
> On 20/11/2024 18:29, David Hildenbrand wrote:
> > Any clue how your new ioctl will interact with the WIP to have shared
> > memory as part of guest_memfd? For example, could it be reasonable to
> > "populate" the shared memory first (via VMA) and then convert that
> > "allocated+filled" memory to private?
>
> Patrick and I synced internally on this. What may actually work for
> guest_memfd population is the following.
>
> Non-CoCo use case:
> - fallocate syscall to fill the page cache, no page content
> initialisation (like it is now)
> - pwrite syscall to initialise the content + mark up-to-date (mark
> prepared), no specific preparation logic is required
>
> The pwrite will have "once" semantics until a subsequent
> fallocate(FALLOC_FL_PUNCH_HOLE), i.e. the next pwrite call will "see"
> that the page is already prepared and return EIO/ENOSPC or something.
I prototyped this to see whether it was possible, and it was. The write
syscall can in fact also do the allocation part, so no prior fallocate
is required. The only caveat is the cap on how much IO can be done in a
single call (MAX_RW_COUNT) [1], which doesn't look like a significant
problem. Does this sound like an acceptable solution?
[1]: https://elixir.bootlin.com/linux/v6.12.1/source/fs/read_write.c#L507
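
As a rough illustration of coping with the MAX_RW_COUNT cap from
userspace, a caller could loop over pwrite and treat a short write as a
cue to continue rather than as an error. This is only a sketch:
populate_range is a hypothetical helper, not code from the prototype,
and it assumes the file descriptor accepts pwrite as proposed above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch series): write `len` bytes from
 * `buf` to `fd` starting at `offset`, issuing as many pwrite() calls as
 * needed. Linux may truncate a single call to MAX_RW_COUNT bytes, so a
 * short write just advances the cursor and the loop continues.
 *
 * Returns 0 on success, -1 with errno set on failure.
 */
int populate_range(int fd, const void *buf, size_t len, off_t offset)
{
    const char *p = buf;

    assert(buf != NULL || len == 0);

    while (len > 0) {
        ssize_t n = pwrite(fd, p, len, offset);

        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before any data moved: retry */
            return -1;      /* real error; errno is set by pwrite() */
        }
        if (n == 0) {
            errno = EIO;    /* no progress: bail out instead of spinning */
            return -1;
        }

        p += n;
        offset += n;
        len -= (size_t)n;
    }

    return 0;
}
```

With the "once" semantics described in the quoted text, a second call
over an already-populated range would be expected to fail, so a caller
would check the return value and errno rather than retry blindly.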
>
> SEV-SNP use case (no changes):
> - fallocate as above
> - KVM_SEV_SNP_LAUNCH_UPDATE to initialise/prepare
>
> We don't think fallocate/pwrite have dependencies on current->mm
> assumptions that Paolo mentioned in [1], so they should be safe to be
> called on guest_memfd from a non-VMM process.
>
> [1]: https://lore.kernel.org/kvm/20241024095429.54052-1-kalyazin@...zon.com/T/#m57498f8e2fde577ad1da948ec74dd2225cd2056c
>
> > Makes sense. Best we can do is:
> >
> > anon: work only on page tables
> > shmem/guest_memfd: work only on page cache
> >
> > So at least "only one treelike structure to update".
>
> This seems to hold with the above reasoning.
>
> > --
> > Cheers,
> >
> > David / dhildenb
>