[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Zh8DHbb8FzoVErgX@google.com>
Date: Tue, 16 Apr 2024 16:00:45 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Paolo Bonzini <pbonzini@...hat.com>
Cc: Rick P Edgecombe <rick.p.edgecombe@...el.com>, "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
Isaku Yamahata <isaku.yamahata@...el.com>, Kai Huang <kai.huang@...el.com>,
"federico.parola@...ito.it" <federico.parola@...ito.it>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"isaku.yamahata@...il.com" <isaku.yamahata@...il.com>, "dmatlack@...gle.com" <dmatlack@...gle.com>,
"michael.roth@....com" <michael.roth@....com>
Subject: Re: [PATCH v2 07/10] KVM: x86: Always populate L1 GPA for KVM_MAP_MEMORY
On Tue, Apr 16, 2024, Paolo Bonzini wrote:
> On Mon, Apr 15, 2024 at 11:17 PM Sean Christopherson <seanjc@...gle.com> wrote:
> > The first question to answer is, do we want to return an error or "silently"
> > install mappings for !SMM, !guest_mode. And so this option becomes relevant only
> > _if_ we want to unconditionally install mappings for the 'base" mode.
> >
> > > > - Return error on guest mode or SMM mode: Without this patch.
> > > > Pros: No additional patch.
> > > > Cons: Difficult to use.
> > >
> > > Hmm... For the non-TDX use cases this is just an optimization, right? For TDX
> > > there shouldn't be an issue. If so, maybe this last one is not so horrible.
>
> It doesn't even have to be ABI that it gives an error. As you say,
> this ioctl can just be advisory only for !confidential machines. Even
> if it were implemented, the shadow MMU can drop roots at any moment
Sure, but there's a difference between KVM _potentially_ dropping roots and
guaranteed failure because userspace is trying to do something that's unsupported.
But I think this is a non-issue, because it should really just be as simple as:
if (!mmu->pre_map_memory)
return -EOPNOTSUPP;
Hmm, or probably this to avoid adding an MMU hook for a single MMU flavor:
if (!tdp_mmu_enabled || !mmu->root_role.direct)
return -EOPNOTSUPP;
> and/or kill the mapping via the shrinker.
Ugh, we really need to kill that code.
> That said, I can't fully shake the feeling that this ioctl should be
> an error for !TDX and that TDX_INIT_MEM_REGION wasn't that bad. The
> implementation was ugly but the API was fine.
Hmm, but IMO the implementation was ugly in no small part because of the contraints
put on KVM by the API. Mapping S-EPT *and* doing TDH.MEM.PAGE.ADD in the same
ioctl() forced KVM to operate on vcpu0, and necessitated shoving temporary data
into a per-VM structure in order to get the source contents into TDH.MEM.PAGE.ADD.
We could eliminate the vcpu0 grossness, but it would require a massive refactor,
which is also not a problem per se, but it's obviously not free. Eliminating
kvm_tdx.source_page is also doable, but it's not clear to me that end result would
be a net positive.
If userspace pre-maps the S-EPT entries ahead of time, then KVM should have a
straight shot to PAGE.ADD, i.e. doesn't need to "pass" the source page via a
scratch field in kvm_tdx, and I think/hope would avoid the need to grab vcpu0
in order to get at an MMU to build the S-EPT.
And stating the obvious, TDX_INIT_MEM_REGION also doesn't allow pre-mapping memory,
which is generally useful, and can be especially beneficial for confidential VMs
(and TDX in particular) due to the added cost of a page fault VM-Exit.
I'm not dead set on this generic ioctl(), but unless it ends up being a train wreck
for userspace, I think it will allow for cleaner and more reusable code in KVM.
Powered by blists - more mailing lists