lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Mon, 20 May 2024 11:58:17 -0700
From: Isaku Yamahata <isaku.yamahata@...el.com>
To: "Huang, Kai" <kai.huang@...el.com>
Cc: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>,
	"Yamahata, Isaku" <isaku.yamahata@...el.com>,
	"dmatlack@...gle.com" <dmatlack@...gle.com>,
	"isaku.yamahata@...il.com" <isaku.yamahata@...il.com>,
	"seanjc@...gle.com" <seanjc@...gle.com>,
	"sagis@...gle.com" <sagis@...gle.com>,
	"isaku.yamahata@...ux.intel.com" <isaku.yamahata@...ux.intel.com>,
	"Aktas, Erdem" <erdemaktas@...gle.com>,
	"Zhao, Yan Y" <yan.y.zhao@...el.com>,
	"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
	"pbonzini@...hat.com" <pbonzini@...hat.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 10/16] KVM: x86/tdp_mmu: Support TDX private mapping for
 TDP MMU

On Mon, May 20, 2024 at 10:38:58AM +0000,
"Huang, Kai" <kai.huang@...el.com> wrote:

> On Sat, 2024-05-18 at 15:41 +0000, Edgecombe, Rick P wrote:
> > On Sat, 2024-05-18 at 05:42 +0000, Huang, Kai wrote:
> > > 
> > > No.  I meant "using kvm_mmu_page.role.mirrored_pt to determine whether to
> > > invoke kvm_x86_ops::xx_private_spt()" is not correct.
> > 
> > I agree this looks wrong.
> > 
> > >   Instead, we should
> > > use fault->is_private to determine:
> > > 
> > >         if (fault->is_private && kvm_x86_ops::xx_private_spt())
> > >                 kvm_x86_ops::xx_private_spte();
> > >         else
> > >                 // normal TDP MMU operation
> > > 
> > > The reason is this pattern works not just for TDX, but also for SNP (and
> > > SW_PROTECTED_VM) if they ever need specific page table ops.

Do you want to split the concept from invoking hooks from mirrored PT
and to allow invoking hooks even for shared PT (probably without
mirrored PT)?  So far I tied the mirrored PT to invoking the hooks as
those hooks are to reflect the changes on mirrored PT to private PT.

Is there any use case to allow hook for shared PT?

- SEV_SNP
  Although I can't speak for SNP folks, I guess they don't need hooks.
  I guess they want to stay away from directly modifying the TDP MMU
  (to add TDP MMU hooks).  Instead, They added hooks to guest_memfd.
  RMP (Reverse mapping table) doesn't have to be consistent with NPT.

  Anyway, I'll reply to
  https://lore.kernel.org/lkml/20240501085210.2213060-1-michael.roth@amd.com/T/#m8ca554a6d4bad7fa94dedefcf5914df19c9b8051
 
TDX
  I don't see immediate need to allow hooks for shared PT.

SW_PROTECTED (today)
  It uses only shared PT and don't need hooks.

SW_PROTECTED (with mirrored pt with shared mask in future in theory)
  This would be similar to TDX, we wouldn't need hooks for shared PT.

SW_PROTECTED (shared PT only without mirrored pt in future in theory)
  I don't see necessity hooks for shared PT.
  (Or I don't see value of this SW_PROTECTED case.)


> > I think the problem is there are a lot of things that are more on the mirrored
> > concept side:
> >  - Allocating the "real" PTE pages (i.e. sp->private_spt)
> >  - Setting the PTE when the mirror changes
> >  - Zapping the real PTE when the mirror is zapped (and there is no fault)
> >  - etc
> > 
> > And on the private side there is just knowing that private faults should operate
> > on the mirror root.
> 
> ... and issue SEAMCALL to operate the real private page table?

For zapping case,
- SEV-SNP
  They use the hook for guest_memfd.
- SW_PROTECTED (with mirrored pt in future in theory)
  This would be similar to TDX.


> > The xx_private_spte() operations are actually just updating the real PTE for the
> > mirror. In some ways it doesn't have to be about "private". It could be a mirror
> > of something else and still need the updates. For SNP and others they don't need
> > to do anything like that. (AFAIU)
> 
> AFAICT xx_private_spte() should issue SEAMCALL to operate the real private
> page table?
> 
> > 
> > So based on that, I tried to change the naming of xx_private_spt() to reflect
> > that. Like:
> > if (role.mirrored)
> >   update_mirrored_pte()
> > 
> > The TDX code could encapsulate that mirrored updates need to update private EPT.
> > Then I had a helper that answered the question of whether to handle private
> > faults on the mirrored root.
> 
> I am fine with this too, but I am also fine with the existing pattern:
> 
> That we update the mirrored_pt using normal TDP MMU operation, and then
> invoke the xx_private_spte() for private GPA.
> 
> My only true comment is, to me it seems more reasonable to invoke
> xx_private_spte() based on fault->is_private, but not on
> 'use_mirrored_pt'.
> 
> See my reply to your question whether SNP needs special handling below.
> 
> > 
> > The FREEZE stuff actually made a bit more sense too, because it was clear it
> > wasn't a special TDX private memory thing, but just about the atomicity.
> > 
> > The problem was I couldn't get rid of all special things that are private (can't
> > remember what now).
> > 
> > I wonder if I should give it a more proper try. What do you think?
> > 
> > At this point, I was just going to change the "mirrored" name to
> > "private_mirrored". Then code that does either mirrored things or private things
> > both looks correct. Basically making it clear that the MMU only supports
> > mirroring private memory.
> 
> I don't have preference on name.  "mirrored_private" also works for me.

For hook names, we can use mirrored_private or reflect or handle?
(or whatever better name)

The current hook names
  {link, free}_private_spt(),
  {set, remove, zap}_private_spte()

=>
  # use mirrored_private
  {link, free}_mirrored_private_spt(),
  {set, remove, zap}_mirrored_private_spte()

  or 
  # use reflect (update or handle?) mirrored to private
  reflect_{linked, freeed}_mirrored_spt(),
  reflect_{set, removed, zapped}_mirrored_spte()

  or 
  # Don't add anything.  I think this would be confusing. 
  {link, free}_spt(),
  {set, remove, zap}_spte()


I think we should also rename the internal functions in TDP MMU.
- handle_removed_private_spte()
- set_private_spte_present()
handle and set is inconsistent. They should have consistent name.

=>
handle_{removed, set}_mirrored_private_spte()
or 
reflect_{removed, set}_mirrored_spte()


> > >         bool mirrored_pt = fault->is_private && kvm_use_mirrored_pt(kvm);
> > > 
> > >         tdp_mmu_for_each_pte(iter, mmu, mirrored_pt, raw_gfn, raw_gfn +
> > > 1) 
> > >         {
> > >                 ...
> > >         }
> > > 
> > > #define tdp_mmu_for_each_pte(_iter, _mmu, _mirrored_pt, _start, _end)   \
> > >         for_each_tdp_pte(_iter,                                         \
> > >                  root_to_sp((_mirrored_pt) ? _mmu->private_root_hpa :   \
> > >                                 _mmu->root.hpa),                        \
> > >                 _start, _end)
> > > 
> > > If you somehow needs the mirrored_pt in later time when handling the page
> > > fault, you don't need another "mirrored_pt" in tdp_iter, because you can
> > > easily get it from the sptep (or just get from the root):
> > > 
> > >         mirrored_pt = sptep_to_sp(sptep)->role.mirrored_pt;
> > > 
> > > What we really need to pass in is the fault->is_private, because we are
> > > not able to get whether a GPN is private based on kvm_shared_gfn_mask()
> > > for SNP and SW_PROTECTED_VM.
> > 
> > SNP and SW_PROTECTED_VM (today) don't need do anything special here, right?
> 
> Conceptually, I think SNP also needs to at least issue some command(s) to
> update the RMP table to reflect the GFN<->PFN relationship.  From this
> point, I do see a fit.
> 
> I briefly looked into SNP patchset, and I also raised the discussion there
> (with you and Isaku copied):
> 
> https://lore.kernel.org/lkml/20240501085210.2213060-1-michael.roth@amd.com/T/#m8ca554a6d4bad7fa94dedefcf5914df19c9b8051
> 
> I could be wrong, though.

I'll reply to it.
-- 
Isaku Yamahata <isaku.yamahata@...el.com>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ