Message-ID: <6b38d1ea3073cdda0f106313d9f0e032345b8b75.camel@intel.com>
Date: Tue, 12 Mar 2024 12:38:15 +0000
From: "Huang, Kai" <kai.huang@...el.com>
To: "seanjc@...gle.com" <seanjc@...gle.com>, "Yamahata, Isaku"
<isaku.yamahata@...el.com>
CC: "kvm@...r.kernel.org" <kvm@...r.kernel.org>, "pbonzini@...hat.com"
<pbonzini@...hat.com>, "federico.parola@...ito.it"
<federico.parola@...ito.it>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "isaku.yamahata@...il.com"
<isaku.yamahata@...il.com>, "michael.roth@....com" <michael.roth@....com>,
"dmatlack@...gle.com" <dmatlack@...gle.com>
Subject: Re: [RFC PATCH 6/8] KVM: x86: Implement kvm_arch_{,pre_}vcpu_map_memory()
>
> Wait. KVM doesn't *need* to do PAGE.ADD from deep in the MMU. The only inputs to
> PAGE.ADD are the gfn, pfn, tdr (vm), and source. The S-EPT structures need to be
> pre-built, but when they are built is irrelevant, so long as they are in place
> before PAGE.ADD.
>
> Crazy idea. For TDX S-EPT, what if KVM_MAP_MEMORY does all of the SEPT.ADD stuff,
> which doesn't affect the measurement, and even fills in KVM's copy of the leaf EPTE,
> but tdx_sept_set_private_spte() doesn't do anything if the TD isn't finalized?
>
> Then KVM provides a dedicated TDX ioctl(), i.e. what is/was KVM_TDX_INIT_MEM_REGION,
> to do PAGE.ADD. KVM_TDX_INIT_MEM_REGION wouldn't need to map anything, it would
> simply need to verify that the pfn from guest_memfd() is the same as what's in
> the TDP MMU.

One small question:

What if the memory region passed to KVM_TDX_INIT_MEM_REGION hasn't been
pre-populated? If we want KVM_TDX_INIT_MEM_REGION to work with such regions,
then we still need to do the real map. Or should KVM_TDX_INIT_MEM_REGION
return an error when it finds that the region hasn't been pre-populated?
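
To make the second option concrete, here is a rough sketch of what such a check
could look like. It is a toy model only: struct toy_spte, the helper name, and
the choice of -ENOENT/-EIO are all assumptions made for illustration, not
existing KVM code or a proposed ABI.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for KVM's copy of a leaf EPTE; not a real KVM structure. */
struct toy_spte {
	bool present;	/* KVM_MAP_MEMORY already filled in KVM's copy */
	uint64_t pfn;	/* pfn recorded in KVM's copy */
};

/*
 * Hypothetical pre-check for KVM_TDX_INIT_MEM_REGION: every page in the
 * range must already be mapped, and the mapped pfn must match the pfn
 * obtained from guest_memfd.
 *
 * Returns 0 if PAGE.ADD could proceed, -ENOENT if a page was not
 * pre-populated, and -EIO if the pfns disagree. The error codes are
 * placeholders chosen for this sketch.
 */
int toy_init_mem_region_check(const struct toy_spte *sptes,
			      const uint64_t *gmem_pfns, size_t npages)
{
	for (size_t i = 0; i < npages; i++) {
		if (!sptes[i].present)
			return -ENOENT;
		if (sptes[i].pfn != gmem_pfns[i])
			return -EIO;
	}
	return 0;
}
```

With something along these lines, userspace would be required to call
KVM_MAP_MEMORY on the range first, and would get an error back instead of an
implicit map.
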
>
> Or if we want to make things more robust for userspace, set a software-available
> flag in the leaf TDP MMU SPTE to indicate that the page is awaiting PAGE.ADD.
> That way tdp_mmu_map_handle_target_level() wouldn't treat a fault as spurious
> (KVM will see the SPTE as PRESENT, but the S-EPT entry will be !PRESENT).
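
As I read it, the flag idea boils down to roughly the following. This is only
a sketch: the macro names and bit positions are made up for illustration and
do not reflect KVM's actual SPTE layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions only; not KVM's real SPTE encoding. */
#define TOY_SPTE_PRESENT	(1ULL << 0)
#define TOY_SPTE_PENDING_ADD	(1ULL << 59)	/* hypothetical software-available bit */

/*
 * A fault on a PRESENT SPTE is spurious only if the S-EPT entry has
 * really been installed, i.e. the page is no longer awaiting PAGE.ADD.
 */
bool toy_fault_is_spurious(uint64_t spte)
{
	if (!(spte & TOY_SPTE_PRESENT))
		return false;
	return !(spte & TOY_SPTE_PENDING_ADD);
}
```
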
>
> Then KVM_MAP_MEMORY doesn't need to support @source, KVM_TDX_INIT_MEM_REGION
> doesn't need to fake a page fault and doesn't need to temporarily stash the
> source_pa in KVM, and KVM_MAP_MEMORY could be used to fully pre-map TDX memory.
>
> I believe the only missing piece is a way for the TDX code to communicate that
> hugepages are disallowed.
>