[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240612185603.GK386318@ls.amr.corp.intel.com>
Date: Wed, 12 Jun 2024 11:56:03 -0700
From: Isaku Yamahata <isaku.yamahata@...el.com>
To: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
Cc: "pbonzini@...hat.com" <pbonzini@...hat.com>,
"seanjc@...gle.com" <seanjc@...gle.com>,
"Huang, Kai" <kai.huang@...el.com>,
"sagis@...gle.com" <sagis@...gle.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"Aktas, Erdem" <erdemaktas@...gle.com>,
"Zhao, Yan Y" <yan.y.zhao@...el.com>,
"dmatlack@...gle.com" <dmatlack@...gle.com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"Yamahata, Isaku" <isaku.yamahata@...el.com>,
"isaku.yamahata@...il.com" <isaku.yamahata@...il.com>
Subject: Re: [PATCH v2 15/15] KVM: x86/tdp_mmu: Add a helper function to walk
down the TDP MMU
On Fri, Jun 07, 2024 at 11:39:14PM +0000,
"Edgecombe, Rick P" <rick.p.edgecombe@...el.com> wrote:
> On Fri, 2024-06-07 at 11:31 +0200, Paolo Bonzini wrote:
> > > -int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
> > > - int *root_level)
> > > +static int __kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64
> > > *sptes,
> > > + enum kvm_tdp_mmu_root_types root_type)
> > > {
> > > - struct kvm_mmu_page *root = root_to_sp(vcpu->arch.mmu->root.hpa);
> > > + struct kvm_mmu_page *root = tdp_mmu_get_root(vcpu, root_type);
> >
> > I think this function should take the struct kvm_mmu_page * directly.
> >
> > > +{
> > > + *root_level = vcpu->arch.mmu->root_role.level;
> > > +
> > > + return __kvm_tdp_mmu_get_walk(vcpu, addr, sptes, KVM_DIRECT_ROOTS);
> >
> > Here you pass root_to_sp(vcpu->arch.mmu->root.hpa);
>
> I see. It is another case of more indirection to try to send the decision making
> through the helpers. We can try to open code things more.
>
> >
> > > +int kvm_tdp_mmu_get_walk_mirror_pfn(struct kvm_vcpu *vcpu, u64 gpa,
> > > + kvm_pfn_t *pfn)
> > > +{
> > > + u64 sptes[PT64_ROOT_MAX_LEVEL + 1], spte;
> > > + int leaf;
> > > +
> > > + lockdep_assert_held(&vcpu->kvm->mmu_lock);
> > > +
> > > + rcu_read_lock();
> > > + leaf = __kvm_tdp_mmu_get_walk(vcpu, gpa, sptes, KVM_MIRROR_ROOTS);
> >
> > and likewise here.
> >
> > You might also consider using a kvm_mmu_root_info for the mirror root,
> > even though the pgd field is not used.
>
> This came up on the last version actually. The reason against it was that it
> used that tiny bit of extra memory for the pgd. It does look more symmetrical
> though.
>
> >
> > Then __kvm_tdp_mmu_get_walk() can take a struct kvm_mmu_root_info * instead.
>
> Ahh, I see. Yes, that's a good reason.
>
> >
> > kvm_tdp_mmu_get_walk_mirror_pfn() doesn't belong in this series, but
> > introducing __kvm_tdp_mmu_get_walk() can stay here.
>
> Ok, we can split it.
>
> >
> > Looking at the later patch, which uses
> > kvm_tdp_mmu_get_walk_mirror_pfn(), I think this function is a bit
> > overkill. I'll do a pre-review of the init_mem_region function,
> > especially the usage of kvm_gmem_populate:
> >
> > + slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
> > + if (!kvm_slot_can_be_private(slot) || !kvm_mem_is_private(kvm, gfn)) {
> > + ret = -EFAULT;
> > + goto out_put_page;
> > + }
> >
> > The slots_lock is taken, so checking kvm_slot_can_be_private is unnecessary.
> >
> > Checking kvm_mem_is_private perhaps should also be done in
> > kvm_gmem_populate() itself. I'll send a patch.
> >
> > + read_lock(&kvm->mmu_lock);
> > +
> > + ret = kvm_tdp_mmu_get_walk_mirror_pfn(vcpu, gpa, &mmu_pfn);
> > + if (ret < 0)
> > + goto out;
> > + if (ret > PG_LEVEL_4K) {
> > + ret = -EINVAL;
> > + goto out;
> > + }
> > + if (mmu_pfn != pfn) {
> > + ret = -EAGAIN;
> > + goto out;
> > + }
> >
> > If you require pre-faulting, you don't need to return mmu_pfn and
> > things would be seriously wrong if the two didn't match, wouldn't
> > they?
>
> Yea, I'm not sure why it would be a normal condition. Maybe Isaku can comment on
> the thinking?
Sean suggested for KVM_TDX_INIT_MEM_REGION to check if the two pfn from TDP MMU
and guest_memfd match. As pointed out, the two PFNs should match under
appropriate lock (or heavily broken). Personally I'm fine to remove such check
and to avoid returning pfn.
https://lore.kernel.org/kvm/Ze-TJh0BBOWm9spT@google.com/
Then KVM provides a dedicated TDX ioctl(), i.e. what is/was KVM_TDX_INIT_MEM_REGION,
to do PAGE.ADD. KVM_TDX_INIT_MEM_REGION wouldn't need to map anything, it would
simply need to verify that the pfn from guest_memfd() is the same as what's in
the TDP MMU.
--
Isaku Yamahata <isaku.yamahata@...el.com>
Powered by blists - more mailing lists