Message-ID: <ZkvrFw1QMtImegQD@google.com>
Date: Mon, 20 May 2024 17:30:15 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Kai Huang <kai.huang@...el.com>
Cc: "isaku.yamahata@...il.com" <isaku.yamahata@...il.com>, "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
Rick P Edgecombe <rick.p.edgecombe@...el.com>, "michael.roth@....com" <michael.roth@....com>,
"pankaj.gupta@....com" <pankaj.gupta@....com>, "tglx@...utronix.de" <tglx@...utronix.de>,
"tobin@....com" <tobin@....com>, "liam.merwick@...cle.com" <liam.merwick@...cle.com>,
"alpergun@...gle.com" <alpergun@...gle.com>, Tony Luck <tony.luck@...el.com>,
"jmattson@...gle.com" <jmattson@...gle.com>, "luto@...nel.org" <luto@...nel.org>,
"ak@...ux.intel.com" <ak@...ux.intel.com>, "pbonzini@...hat.com" <pbonzini@...hat.com>,
"pgonda@...gle.com" <pgonda@...gle.com>,
"srinivas.pandruvada@...ux.intel.com" <srinivas.pandruvada@...ux.intel.com>, "slp@...hat.com" <slp@...hat.com>,
"rientjes@...gle.com" <rientjes@...gle.com>, "peterz@...radead.org" <peterz@...radead.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"dovmurik@...ux.ibm.com" <dovmurik@...ux.ibm.com>,
"thomas.lendacky@....com" <thomas.lendacky@....com>, "x86@...nel.org" <x86@...nel.org>, "bp@...en8.de" <bp@...en8.de>,
"vkuznets@...hat.com" <vkuznets@...hat.com>, "vbabka@...e.cz" <vbabka@...e.cz>,
"ashish.kalra@....com" <ashish.kalra@....com>,
"linux-coco@...ts.linux.dev" <linux-coco@...ts.linux.dev>,
"nikunj.dadhania@....com" <nikunj.dadhania@....com>, Jorg Rodel <jroedel@...e.de>,
"mingo@...hat.com" <mingo@...hat.com>,
"sathyanarayanan.kuppuswamy@...ux.intel.com" <sathyanarayanan.kuppuswamy@...ux.intel.com>,
"hpa@...or.com" <hpa@...or.com>, "kirill@...temov.name" <kirill@...temov.name>,
"jarkko@...nel.org" <jarkko@...nel.org>, "ardb@...nel.org" <ardb@...nel.org>,
"linux-crypto@...r.kernel.org" <linux-crypto@...r.kernel.org>, "linux-mm@...ck.org" <linux-mm@...ck.org>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>
Subject: Re: [PATCH v15 13/20] KVM: SEV: Implement gmem hook for initializing
private pages
On Tue, May 21, 2024, Kai Huang wrote:
> On 21/05/2024 11:15 am, Sean Christopherson wrote:
> > On Tue, May 21, 2024, Kai Huang wrote:
> > > On 21/05/2024 5:35 am, Sean Christopherson wrote:
> > > > On Mon, May 20, 2024, Kai Huang wrote:
> > > > > I am wondering whether this can be done in the KVM page fault handler?
> > > >
> > > > No, because the state of a pfn in the RMP is tied to the guest_memfd inode,
> > > > not to the file descriptor, i.e. not to an individual VM.
> > >
> > > It's strange that the state of a PFN for SNP doesn't bind to an individual
> > > VM, at least for private pages. The command rmp_make_private() indeed
> > > reflects the mapping between PFN <-> <GFN, SSID>.
> >
> > s/SSID/ASID
> >
> > KVM allows a single ASID to be bound to multiple "struct kvm" instances, e.g.
> > for intra-host migration. If/when trusted I/O is a thing, presumably KVM will
> > also need to share the ASID with other entities, e.g. IOMMUFD.
>
> But is this the case for SNP? I thought that, due to the nature of private
> pages, they cannot be shared between VMs? So to me this RMP entry mapping of
> PFN <-> GFN for a private page should just be per-VM.
Sorry to redirect, but please read this mail (and probably surrounding mails).
It hopefully answers most of the questions you have.
https://lore.kernel.org/all/ZLGiEfJZTyl7M8mS@google.com
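
To make the inode-vs-fd point a bit more concrete, the shape of the gmem
preparation hook is roughly the below.  This is just a sketch: the
rmp_make_private() signature is what I recall from the SNP host series, and
sev_lookup_asid() plus the rest of the plumbing are made-up placeholders, not
what the patch actually does.

/*
 * Illustrative only: a gmem "prepare" hook that assigns a pfn to the
 * guest's ASID when guest_memfd hands out a private page.  The RMP entry
 * binds the pfn to <GPA, ASID>, and that binding lives with the physical
 * page, i.e. with the guest_memfd inode, not with any one VM's file
 * descriptor.  sev_lookup_asid() is a hypothetical helper.
 */
static int snp_prepare_private_page(struct kvm *kvm, gfn_t gfn,
				    kvm_pfn_t pfn, int level)
{
	u32 asid = sev_lookup_asid(kvm);

	return rmp_make_private(pfn, gfn_to_gpa(gfn), level, asid, false);
}
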
> > Regardless of whether or not guest_memfd supports page migration, KVM needs to
> > track the state of the physical page in guest_memfd, e.g. if it's been assigned
> > to the ASID versus if it's still in a shared state.
>
> I am not certain how this impacts whether we want to do RMP commands via
> guest_memfd() hooks or TDP MMU hooks.
>
> > > If we truly want to zap private mappings for SNP, IIUC it can be done by
> > > distinguishing whether a VM needs to use a separate private table, which is
> > > TDX-only.
> >
> > I wouldn't say we "want" to zap private mappings for SNP, rather that it's a lot
> > less work to keep KVM's existing behavior (literally do nothing) than it is to
> > rework the MMU and whatnot to not zap SPTEs.
>
> My thinking too.
>
> > And there's no big motivation to avoid zapping because SNP VMs are unlikely
> > to delete memslots.
>
> I think we should also consider the MMU notifier?
No, private mappings have no host userspace mappings, i.e. are completely exempt
from MMU notifier events. guest_memfd() can still invalidate mappings, but that
only occurs if userspace punches a hole, which is destructive.
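
For completeness, the destructive operation in question is a plain hole punch
on the guest_memfd fd, e.g. something like the below (gmem_fd/offset/len are
obviously placeholders):

/*
 * Userspace discarding a range of private guest memory.  Punching a hole
 * in the guest_memfd is what triggers gmem invalidation; there is no host
 * userspace mapping of private pages, so there is no MMU notifier path.
 */
#define _GNU_SOURCE
#include <fcntl.h>

static int discard_private_range(int gmem_fd, off_t offset, off_t len)
{
	return fallocate(gmem_fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
			 offset, len);
}
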
> > If it turns out that it's easy to preserve SNP mappings after TDX lands, then we
> > can certainly go that route, but AFAIK there's no reason to force the issue.
>
> No, I am certainly not saying we should do SNP after TDX. Sorry, I didn't
> closely monitor the status of this SNP patchset.
>
> My intention is just to make the TDP MMU common code change more useful
> (since we need that for TDX anyway), i.e., not effectively TDX-only if
> possible:
>
> Currently the TDP MMU hooks are called depending on whether the page table
> type is private (or mirrored, whatever we call it), but I think conceptually
> we should decide whether to call the TDP MMU hooks based on whether the
> faulting GPA is private, _AND_ whether the hook is available.
>
> https://lore.kernel.org/lkml/5e8119c0-31f5-4aa9-a496-4ae10bd745a3@intel.com/
>
> If invoking SNP RMP commands is feasible in TDP MMU hooks,
Feasible? Yes. Desirable? No. Either KVM tracks the state of the physical page
using the guest_memfd inode, or KVM _guarantees_ the NPT mappings _never_ get
dropped, including during intra-host migration. E.g. to support intra-host
migration of TDX VMs, KVM is pretty much forced to transfer the S-EPT tables as-is,
which is ugly and painful (though performant). We could do the same for NPT, but
there would need to be massive performance benefits to justify the complexity.
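
For the first option, a purely hypothetical sketch of what per-page tracking in
guest_memfd could look like, keyed by offset into the inode so the state
survives VM file descriptors coming and going (the struct, the xarray usage and
the names are illustrative, not what the patches actually do):

/*
 * Hypothetical: track which pages have been assigned to the ASID, indexed
 * by page offset within the guest_memfd inode rather than by anything
 * tied to a "struct kvm".
 */
struct gmem_private_state {
	struct xarray assigned;
};

static bool gmem_page_is_assigned(struct gmem_private_state *state,
				  pgoff_t index)
{
	return xa_load(&state->assigned, index) != NULL;
}

static int gmem_mark_assigned(struct gmem_private_state *state, pgoff_t index)
{
	return xa_err(xa_store(&state->assigned, index, xa_mk_value(1),
			       GFP_KERNEL));
}
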
> then I think there's value in letting the SNP code use them too. And we can
> simply split out one patch that only adds the TDP MMU hooks, so that SNP can
> land first.