Message-ID: <20241112073327.21979-1-yan.y.zhao@intel.com>
Date: Tue, 12 Nov 2024 15:33:27 +0800
From: Yan Zhao <yan.y.zhao@...el.com>
To: pbonzini@...hat.com,
seanjc@...gle.com,
kvm@...r.kernel.org,
dave.hansen@...ux.intel.com
Cc: rick.p.edgecombe@...el.com,
kai.huang@...el.com,
adrian.hunter@...el.com,
reinette.chatre@...el.com,
xiaoyao.li@...el.com,
tony.lindgren@...el.com,
binbin.wu@...ux.intel.com,
dmatlack@...gle.com,
isaku.yamahata@...el.com,
isaku.yamahata@...il.com,
nik.borisov@...e.com,
linux-kernel@...r.kernel.org,
x86@...nel.org,
Yan Zhao <yan.y.zhao@...el.com>
Subject: [PATCH v2 00/24] TDX MMU Part 2
Hi,
Here is v2 of the TDX “MMU part 2” series.
As discussed earlier, the non-nit feedback from v1[0] has been applied.
- Among them, patch "KVM: TDX: MTRR: implement get_mt_mask() for TDX" was
dropped. The feature self-snoop was not made a dependency for enabling
TDX since checking for the feature self-snoop was not included in
kvm_mmu_may_ignore_guest_pat() in the base code. So, strictly speaking,
the current code would incorrectly zap the mirrored root if non-coherent DMA
devices were hot-plugged.
I also noticed a few minor issues and fixed them without internal
discussion (noted in each patch's version log).
It’s now ready to hand off to Paolo/kvm-coco-queue.
One remaining item that requires further discussion is "How to handle
the TDX module lock contention (i.e. SEAMCALL retry replacements)".
The basis for future discussions includes:
(1) TDH.MEM.TRACK can contend with TDH.VP.ENTER on the TD epoch lock.
(2) TDH.VP.ENTER contends with TDH.MEM* on the S-EPT tree lock when 0-stepping
mitigation is triggered.
- The 0-stepping mitigation counter is kept per vCPU. The threshold is
reached when the TDX module finds that EPT violations are caused by the
same RIP as in the last TDH.VP.ENTER 6 consecutive times.
The threshold value 6 is explained as:
"There can be at most 2 mapping faults on instruction fetch
(x86 macro-instructions length is at most 15 bytes) when the
instruction crosses page boundary; then there can be at most 2
mapping faults for each memory operand, when the operand crosses
page boundary. For most of x86 macro-instructions, there are up to 2
memory operands and each one of them is small, which brings us to
maximum 2+2*2 = 6 legal mapping faults."
- If the EPT violations received by KVM are caused by
TDG.MEM.PAGE.ACCEPT, they will not trigger 0-stepping mitigation.
Since a TD is required to call TDG.MEM.PAGE.ACCEPT before accessing
private memory when configured with pending_ve_disable=Y, 0-stepping
mitigation is not expected to occur in such a TD.
(3) TDG.MEM.PAGE.ACCEPT can contend with SEAMCALLs TDH.MEM*.
(Actually, TDG.MEM.PAGE.ATTR.RD or TDG.MEM.PAGE.ATTR.WR can also
contend with SEAMCALLs TDH.MEM*. Although we don't need to consider
these two TDCALLs when enabling basic TDX, they are allowed by the
TDX module, and we can't control whether a TD invokes a TDCALL or
not).
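To make the counting in (2) concrete, here is a minimal sketch of a per-vCPU
consecutive-fault counter. It is only a model for discussion, not TDX module
code: struct zero_step_tracker, zero_step_record_fault() and
ZERO_STEP_THRESHOLD are hypothetical names, and I am assuming that the count
resets whenever the faulting RIP changes and that the mitigation triggers on
the 6th consecutive fault.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model, not TDX module code.
 * 6 = 2 instruction-fetch faults + 2 memory operands * 2 faults each.
 */
#define ZERO_STEP_THRESHOLD 6

struct zero_step_tracker {
	uint64_t last_rip;	/* guest RIP at the previous EPT violation */
	unsigned int count;	/* consecutive EPT violations at last_rip */
};

/*
 * Record one EPT violation for a vCPU. Returns true once the same RIP has
 * faulted ZERO_STEP_THRESHOLD times in a row (assumed trigger point).
 * A fault at a different RIP restarts the count (assumed reset rule).
 */
static bool zero_step_record_fault(struct zero_step_tracker *t, uint64_t rip)
{
	assert(t != NULL);

	if (t->count > 0 && t->last_rip == rip) {
		t->count++;
	} else {
		t->last_rip = rip;
		t->count = 1;
	}

	return t->count >= ZERO_STEP_THRESHOLD;
}
```

With a zero-initialized tracker, five faults at the same RIP return false, the
sixth returns true, and a fault at any other RIP starts the count over at 1.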
The patch "KVM: TDX: Retry seamcall when TDX_OPERAND_BUSY with operand SEPT"
is still in place in this series (at the tail), but we should drop it when we
settle on the real solution.
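For readers who have not looked at the tail patch, the general shape of a
bounded retry on contention is sketched below. This is not the code from that
patch. The status codes, sketch_retry_on_busy() and the stand-in callee
sketch_busy_then_ok() are all hypothetical, and the real SEAMCALL return
values and operand encoding are deliberately not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes; real SEAMCALL return values are not modeled. */
enum sketch_status {
	SKETCH_SUCCESS = 0,
	SKETCH_BUSY_SEPT = 1,	/* stands in for "operand busy, operand SEPT" */
	SKETCH_ERROR = 2,
};

typedef enum sketch_status (*sketch_call_fn)(void *ctx);

/*
 * Call fn once, then retry up to max_retries more times while it reports
 * contention. Any other status, success or error, is returned immediately.
 * If attempts is non-NULL it receives the total number of calls made.
 */
static enum sketch_status sketch_retry_on_busy(sketch_call_fn fn, void *ctx,
					       unsigned int max_retries,
					       unsigned int *attempts)
{
	enum sketch_status st;
	unsigned int n = 0;

	assert(fn != NULL);

	do {
		st = fn(ctx);
		n++;
	} while (st == SKETCH_BUSY_SEPT && n <= max_retries);

	if (attempts)
		*attempts = n;

	return st;
}

/* Stand-in callee: reports contention for *ctx calls, then succeeds. */
static enum sketch_status sketch_busy_then_ok(void *ctx)
{
	unsigned int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return SKETCH_BUSY_SEPT;
	}

	return SKETCH_SUCCESS;
}
```

The bound matters: if the contention persists, the caller gets the busy status
back after max_retries + 1 calls instead of spinning forever, which is part of
why a plain retry loop is only a placeholder for the real solution.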
This series has 5 commits intended to collect Acks from x86 maintainers.
These commits introduce and export SEAMCALL wrappers to allow KVM to manage
the S-EPT (the EPT that maps private memory and is protected by the TDX
module):
x86/virt/tdx: Add SEAMCALL wrapper tdh_mem_sept_add() to add SEPT
pages
x86/virt/tdx: Add SEAMCALL wrappers to add TD private pages
x86/virt/tdx: Add SEAMCALL wrappers to manage TDX TLB tracking
x86/virt/tdx: Add SEAMCALL wrappers to remove a TD private page
x86/virt/tdx: Add SEAMCALL wrappers for TD measurement of initial
contents
This series is based off of a kvm-coco-queue commit and some pre-req
series:
1. commit ee69eb746754 ("KVM: x86/mmu: Prevent aliased memslot GFNs") (in
kvm-coco-queue).
2. v7 of "TDX host: metadata reading tweaks, bug fix and info dump" [1].
3. v1 of "KVM: VMX: Initialize TDX when loading KVM module" [2], with some
new feedback from Sean.
4. v2 of “TDX vCPU/VM creation” [3].
It requires TDX module 1.5.06.00.0744[4], or later. This is due to removal
of the workarounds for the lack of the NO_RBP_MOD feature required by the
kernel. Now NO_RBP_MOD is enabled (in VM/vCPU creation patches), and this
particular version of the TDX module has a required NO_RBP_MOD related bug
fix.
A working edk2 commit is 95d8a1c ("UnitTestFrameworkPkg: Use TianoCore
mirror of subhook submodule").
The series has been tested as part of the development branch for the TDX
base series. Testing consisted of TDX kvm-unit-tests, booting a Linux TD,
and TDX-enhanced KVM selftests.
The full KVM branch is here:
https://github.com/intel/tdx/tree/tdx_kvm_dev-2024-11-11.3
Matching QEMU:
https://github.com/intel-staging/qemu-tdx/commits/tdx-qemu-upstream-v6.1/
[0] https://lore.kernel.org/kvm/20240904030751.117579-1-rick.p.edgecombe@intel.com/
[1] https://lore.kernel.org/kvm/cover.1731318868.git.kai.huang@intel.com/#t
[2] https://lore.kernel.org/kvm/cover.1730120881.git.kai.huang@intel.com/
[3] https://lore.kernel.org/kvm/20241030190039.77971-1-rick.p.edgecombe@intel.com/
[4] https://github.com/intel/tdx-module/releases/tag/TDX_1.5.06
Isaku Yamahata (17):
KVM: x86/tdp_mmu: Add a helper function to walk down the TDP MMU
KVM: TDX: Add accessors VMX VMCS helpers
KVM: TDX: Set gfn_direct_bits to shared bit
x86/virt/tdx: Add SEAMCALL wrapper tdh_mem_sept_add() to add SEPT
pages
x86/virt/tdx: Add SEAMCALL wrappers to add TD private pages
x86/virt/tdx: Add SEAMCALL wrappers to manage TDX TLB tracking
x86/virt/tdx: Add SEAMCALL wrappers to remove a TD private page
x86/virt/tdx: Add SEAMCALL wrappers for TD measurement of initial
contents
KVM: TDX: Require TDP MMU and mmio caching for TDX
KVM: x86/mmu: Add setter for shadow_mmio_value
KVM: TDX: Set per-VM shadow_mmio_value to 0
KVM: TDX: Handle TLB tracking for TDX
KVM: TDX: Implement hooks to propagate changes of TDP MMU mirror page
table
KVM: TDX: Implement hook to get max mapping level of private pages
KVM: TDX: Add an ioctl to create initial guest memory
KVM: TDX: Finalize VM initialization
KVM: TDX: Handle vCPU dissociation
Rick Edgecombe (3):
KVM: x86/mmu: Implement memslot deletion for TDX
KVM: VMX: Teach EPT violation helper about private mem
KVM: x86/mmu: Export kvm_tdp_map_page()
Sean Christopherson (2):
KVM: VMX: Split out guts of EPT violation to common/exposed function
KVM: TDX: Add load_mmu_pgd method for TDX
Yan Zhao (1):
KVM: x86/mmu: Do not enable page track for TD guest
Yuan Yao (1):
[HACK] KVM: TDX: Retry seamcall when TDX_OPERAND_BUSY with operand
SEPT
arch/x86/include/asm/tdx.h | 9 +
arch/x86/include/asm/vmx.h | 1 +
arch/x86/include/uapi/asm/kvm.h | 10 +
arch/x86/kvm/mmu.h | 4 +
arch/x86/kvm/mmu/mmu.c | 7 +-
arch/x86/kvm/mmu/page_track.c | 3 +
arch/x86/kvm/mmu/spte.c | 8 +-
arch/x86/kvm/mmu/tdp_mmu.c | 37 +-
arch/x86/kvm/vmx/common.h | 43 ++
arch/x86/kvm/vmx/main.c | 104 ++++-
arch/x86/kvm/vmx/tdx.c | 727 +++++++++++++++++++++++++++++++-
arch/x86/kvm/vmx/tdx.h | 93 ++++
arch/x86/kvm/vmx/tdx_arch.h | 23 +
arch/x86/kvm/vmx/vmx.c | 25 +-
arch/x86/kvm/vmx/x86_ops.h | 51 +++
arch/x86/virt/vmx/tdx/tdx.c | 176 ++++++++
arch/x86/virt/vmx/tdx/tdx.h | 8 +
virt/kvm/kvm_main.c | 1 +
18 files changed, 1278 insertions(+), 52 deletions(-)
create mode 100644 arch/x86/kvm/vmx/common.h
--
2.43.2