Message-ID: <20250227012021.1778144-1-binbin.wu@linux.intel.com>
Date: Thu, 27 Feb 2025 09:20:01 +0800
From: Binbin Wu <binbin.wu@...ux.intel.com>
To: pbonzini@...hat.com,
	seanjc@...gle.com,
	kvm@...r.kernel.org
Cc: rick.p.edgecombe@...el.com,
	kai.huang@...el.com,
	adrian.hunter@...el.com,
	reinette.chatre@...el.com,
	xiaoyao.li@...el.com,
	tony.lindgren@...el.com,
	isaku.yamahata@...el.com,
	yan.y.zhao@...el.com,
	chao.gao@...el.com,
	linux-kernel@...r.kernel.org,
	binbin.wu@...ux.intel.com
Subject: [PATCH v2 00/20] KVM: TDX: TDX "the rest" part

Hi,

This patch series adds support for EPT violation/misconfig handling and
several TDVMCALL leaves, adds a bunch of wrappers to ignore operations
not supported by TDX guests, and adds the documentation.

This patch series is the last part needed to provide the ability to run a
functioning TD VM.  We think this is in pretty good shape at this point and
ready for handoff to Paolo.


Base of this series
===================
This series is based on kvm-coco-queue up to the end of "TDX interrupts",
plus one PAT quirk series. Stack is:
 - '31db5921f12d ("KVM: TDX: Handle EXIT_REASON_OTHER_SMI")' from
   kvm-coco-queue.
 - PAT quirk series
   "KVM: x86: Introduce quirk KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT" [0].


Notable changes since v1 [1]
============================
Patch "KVM: x86: Add a switch_db_regs flag to handle TDX's auto-switched
behavior" is moved to "KVM: TDX: TD vcpu enter/exit" [2].

Rebased after [3] added tdcall_to_vmx_exit_reason() and changed the way
exit_qualification and ext_exit_qualification are obtained.

For EPT MISCONFIG, bug the VM and return -EIO.  The handling is deferred
until tdx_handle_exit() because tdx_to_vmx_exit_reason() is called by
'noinstr' code with interrupts disabled.

Add SEPT local retry and wait-for-SEPT-zap logic, which provide a clean
way to avoid blind SEPT retries.

Morph the following guest requested exit reasons (via TDVMCALL) to KVM's
tracked exit reasons:
 - Morph PV CPUID to EXIT_REASON_CPUID
 - Morph PV HLT to EXIT_REASON_HLT
 - Morph PV RDMSR to EXIT_REASON_RDMSR
 - Morph PV WRMSR to EXIT_REASON_WRMSR
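
The morphing above can be pictured with the minimal sketch below.  It is
not the code from the series: the function and macro names are
hypothetical, and it assumes that the TDVMCALL leaf numbers for CPUID, HLT,
RDMSR and WRMSR equal the corresponding VMX basic exit reasons (10, 12, 31,
32) and that the TDCALL exit reason is 77.

```c
#include <stdint.h>

/* Assumed values: VMX basic exit reasons, reused as TDVMCALL leaf numbers. */
#define SKETCH_EXIT_REASON_CPUID   10
#define SKETCH_EXIT_REASON_HLT     12
#define SKETCH_EXIT_REASON_RDMSR   31
#define SKETCH_EXIT_REASON_WRMSR   32
#define SKETCH_EXIT_REASON_TDCALL  77

/*
 * Hypothetical helper: translate the leaf of a guest-requested exit into
 * the exit reason that the common exit handling already tracks.  Leaves
 * that are not morphed keep the generic TDCALL exit reason.
 */
static inline uint32_t sketch_morph_tdvmcall_leaf(uint64_t leaf)
{
	switch (leaf) {
	case SKETCH_EXIT_REASON_CPUID:
	case SKETCH_EXIT_REASON_HLT:
	case SKETCH_EXIT_REASON_RDMSR:
	case SKETCH_EXIT_REASON_WRMSR:
		return (uint32_t)leaf;
	default:
		return SKETCH_EXIT_REASON_TDCALL;
	}
}
```

With this shape, the PV exits can be dispatched to the same handlers as
their non-TDX counterparts.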

Check RVI pending (bit 0 of the TD_VCPU_STATE_DETAILS_NON_ARCH field) only
in the HALTED case with IRQs enabled in tdx_protected_apic_has_interrupt().
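
As a rough illustration of that condition, the sketch below gates the
bit 0 test on the vCPU being halted with interrupts enabled.  The struct
and its fields are hypothetical stand-ins; in particular, state_details
stands in for a read of the TD_VCPU_STATE_DETAILS_NON_ARCH field.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of the state details field: RVI pending (per the text above). */
#define SKETCH_STATE_DETAILS_RVI_PENDING (1ULL << 0)

/* Hypothetical per-vCPU state, only what the check needs. */
struct sketch_halt_state {
	bool halted;             /* guest is in the HALTED state */
	bool irq_disabled;       /* interrupt-blocked flag passed with HLT */
	uint64_t state_details;  /* stand-in for the field read */
};

/* Report a pending RVI only for a halted vCPU with IRQs enabled. */
static bool sketch_rvi_pending(const struct sketch_halt_state *s)
{
	if (!s->halted || s->irq_disabled)
		return false;

	return (s->state_details & SKETCH_STATE_DETAILS_RVI_PENDING) != 0;
}
```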

For PV RDMSR/WRMSR handling, marshal values to the appropriate x86
registers to leverage the existing kvm_emulate_{rdmsr,wrmsr}(), and
implement the complete_emulated_msr() callback to set the return
value/code in vp_enter_args.
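
A minimal sketch of the marshalling idea follows.  It is illustrative
only: the structs and helpers are hypothetical, and the register layout
(MSR index in r12, WRMSR value in r13, RDMSR result in r11, status in r10)
and the error status value are assumptions, not taken from this series.

```c
#include <stdint.h>

/* Assumed error status value, for illustration only. */
#define SKETCH_TDVMCALL_STATUS_INVALID_OPERAND 0x8000000000000000ULL

/* Hypothetical stand-ins for the guest GPRs and the TDVMCALL arguments. */
struct sketch_regs { uint64_t rax, rcx, rdx; };
struct sketch_vp_enter_args { uint64_t r10, r11, r12, r13; };

/* WRMSR: index goes to ECX, the 64-bit value is split into EDX:EAX. */
static void sketch_marshal_wrmsr(struct sketch_regs *regs,
				 const struct sketch_vp_enter_args *args)
{
	regs->rcx = (uint32_t)args->r12;
	regs->rax = (uint32_t)args->r13;
	regs->rdx = args->r13 >> 32;
}

/* RDMSR: only the index needs to be placed in ECX before emulation. */
static void sketch_marshal_rdmsr(struct sketch_regs *regs,
				 const struct sketch_vp_enter_args *args)
{
	regs->rcx = (uint32_t)args->r12;
}

/* Completion: on error set the status, otherwise return EDX:EAX in r11. */
static void sketch_complete_rdmsr(struct sketch_vp_enter_args *args,
				  const struct sketch_regs *regs, int err)
{
	if (err) {
		args->r10 = SKETCH_TDVMCALL_STATUS_INVALID_OPERAND;
		return;
	}
	/* r10 is left untouched on success; it is already 0. */
	args->r11 = (regs->rdx << 32) | (uint32_t)regs->rax;
}
```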

Skip setting the return code when the value is TDVMCALL_STATUS_SUCCESS
because r10 is always 0 for a standard TDVMCALL exit.

Get/set TDVMCALL inputs/outputs from/to vp_enter_args directly in struct
vcpu_tdx, after the helpers for reading/writing a0~a3 were dropped in [3].

Added back MTRR MSR access, but dropped the special handling for TDX
guests to align with what KVM does for normal VMs.

Dropped tdx_cache_reg().

Updated documentation.


TODO
====
Macrofy vt_x86_ops callbacks, as suggested by Sean [4].


Overview
========
EPT violation
-------------
An EPT violation for TDX triggers the x86 MMU code.
Note that instruction fetch from shared memory is not allowed for TDX
guests.  If it occurs, treat it as broken hardware, bug the VM and return
an error.
(*New in v2*)
SEPT local retry and wait-for-SEPT-zap logic provide a clean way to avoid
blind SEPT retries.
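
The instruction-fetch rule can be sketched as below.  This is a simplified
illustration with hypothetical names, not the handler from the series.  It
assumes that the shared bit is the top bit of the guest physical address
width (for example bit 51 for a 52-bit width) and that bit 2 of the EPT
violation exit qualification marks an instruction fetch.

```c
#include <stdbool.h>
#include <stdint.h>

/* EPT violation exit qualification, bit 2: access was an instruction fetch. */
#define SKETCH_EPT_VIOLATION_ACC_INSTR (1ULL << 2)

enum sketch_ept_outcome {
	SKETCH_HANDLE_FAULT,  /* hand the fault to the MMU code */
	SKETCH_BUG_VM,        /* treat as broken hardware */
};

/* Assumption: the shared bit is the top bit of the GPA width. */
static inline bool sketch_gpa_is_shared(uint64_t gpa, unsigned int gpa_width)
{
	return (gpa & (1ULL << (gpa_width - 1))) != 0;
}

/* Hypothetical classifier for a TDX EPT violation exit. */
static enum sketch_ept_outcome
sketch_classify_ept_violation(uint64_t gpa, uint64_t exit_qual,
			      unsigned int gpa_width)
{
	if (sketch_gpa_is_shared(gpa, gpa_width) &&
	    (exit_qual & SKETCH_EPT_VIOLATION_ACC_INSTR))
		return SKETCH_BUG_VM;

	return SKETCH_HANDLE_FAULT;
}
```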

EPT misconfiguration
--------------------
EPT misconfiguration shouldn't happen for TDX guests.  If it occurs, bug
the VM and return an error.

TDVMCALL support
----------------
Support is added to allow TDX guests to issue CPUID, HLT, RDMSR/WRMSR and
GetTdVmCallInfo via TDVMCALL.

- CPUID
  For TDX, most CPUID leaf/sub-leaf combinations are virtualized by the TDX
  module while some trigger #VE.  On #VE, the TDX guest can issue a
  TDVMCALL with the Instruction.CPUID leaf to ask the VMM to emulate the
  CPUID operation.

- HLT
  A TDX guest can issue a TDVMCALL with HLT, which passes the
  interrupt-blocked flag.  Whether an interrupt is allowed depends on this
  flag.  For NMIs, KVM can't get the NMI-blocked status of a TDX guest, so
  it always assumes NMIs are allowed.

- MSRs
  Some MSRs are virtualized by the TDX module directly, while others
  trigger #VE when the guest accesses them.  On #VE, TDX guests can issue a
  TDVMCALL with WRMSR or RDMSR to request emulation in the VMM.

Operations ignored
------------------
TDX protects TDX guest state from the VMM, and some features are not
supported for TDX guests, so a bunch of operations are ignored for TDX
guests, including: accesses to CPU state, the VMX preemption timer,
accesses to the TSC offset and multiplier, MCE setup for LMCE
enable/disable, and hypercall patching.
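
The wrapper pattern behind these ignored operations might look like the
sketch below.  All names here are hypothetical; the point is only the
shape: a thin wrapper returns early for a TD vCPU and otherwise forwards
to the existing VMX implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical vCPU, reduced to what the example needs. */
struct sketch_vt_vcpu {
	bool is_td;           /* true for a vCPU that belongs to a TD */
	uint64_t tsc_offset;  /* state the VMX path would program */
};

/* Stand-in for the existing VMX implementation of the operation. */
static void sketch_vmx_write_tsc_offset(struct sketch_vt_vcpu *vcpu,
					uint64_t offset)
{
	vcpu->tsc_offset = offset;
}

/* Wrapper: the operation is silently ignored for TDX guests. */
static void sketch_vt_write_tsc_offset(struct sketch_vt_vcpu *vcpu,
				       uint64_t offset)
{
	if (vcpu->is_td)
		return;

	sketch_vmx_write_tsc_offset(vcpu, offset);
}
```

Keeping the check in a wrapper leaves the VMX code untouched, at the cost
of one wrapper per callback.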


Repos
=====
Due to "KVM: VMX: Move common fields of struct" in "TDX vcpu enter/exit" v2
[2], subsequent patches require changes to use the new struct vcpu_vt;
refer to the full KVM branch below.

It requires TDX module 1.5.06.00.0744 [5], or later as mentioned in [2].
A working edk2 commit is 95d8a1c ("UnitTestFrameworkPkg: Use TianoCore
mirror of subhook submodule").

The full KVM branch is here:
https://github.com/intel/tdx/tree/tdx_kvm_dev-2025-02-26

A matching QEMU is here:
https://github.com/intel-staging/qemu-tdx/tree/tdx-qemu-wip-2025-02-18


Testing
=======
It has been tested as part of the development branch for the TDX base
series.  Testing consisted of TDX kvm-unit-tests, booting a Linux TD, and
TDX-enhanced KVM selftests.  It also passed the TDX-related test cases
defined in the LKVS test suite as described in:
https://github.com/intel/lkvs/blob/main/KVM/docs/lkvs_on_avocado.md


[0] https://lore.kernel.org/kvm/20250224070716.31360-1-yan.y.zhao@intel.com
[1] https://lore.kernel.org/kvm/20241210004946.3718496-1-binbin.wu@linux.intel.com
[2] https://lore.kernel.org/kvm/20250129095902.16391-1-adrian.hunter@intel.com
[3] https://lore.kernel.org/kvm/20250222014225.897298-1-binbin.wu@linux.intel.com
[4] https://lore.kernel.org/kvm/Z6v9yjWLNTU6X90d@google.com
[5] https://github.com/intel/tdx-module/releases/tag/TDX_1.5.06

Binbin Wu (1):
  KVM: TDX: Enable guest access to MTRR MSRs

Isaku Yamahata (16):
  KVM: TDX: Handle EPT violation/misconfig exit
  KVM: TDX: Handle TDX PV CPUID hypercall
  KVM: TDX: Handle TDX PV HLT hypercall
  KVM: x86: Move KVM_MAX_MCE_BANKS to header file
  KVM: TDX: Implement callbacks for MSR operations
  KVM: TDX: Handle TDX PV rdmsr/wrmsr hypercall
  KVM: TDX: Enable guest access to LMCE related MSRs
  KVM: TDX: Handle TDG.VP.VMCALL<GetTdVmCallInfo> hypercall
  KVM: TDX: Add methods to ignore accesses to CPU state
  KVM: TDX: Add method to ignore guest instruction emulation
  KVM: TDX: Add methods to ignore VMX preemption timer
  KVM: TDX: Add methods to ignore accesses to TSC
  KVM: TDX: Ignore setting up mce
  KVM: TDX: Add a method to ignore hypercall patching
  KVM: TDX: Make TDX VM type supported
  Documentation/virt/kvm: Document on Trust Domain Extensions (TDX)

Yan Zhao (3):
  KVM: TDX: Detect unexpected SEPT violations due to pending SPTEs
  KVM: TDX: Retry locally in TDX EPT violation handler on RET_PF_RETRY
  KVM: TDX: Kick off vCPUs when SEAMCALL is busy during TD page removal

 Documentation/virt/kvm/api.rst           |  13 +-
 Documentation/virt/kvm/x86/index.rst     |   1 +
 Documentation/virt/kvm/x86/intel-tdx.rst | 255 ++++++++++++
 arch/x86/include/asm/shared/tdx.h        |   1 +
 arch/x86/include/asm/vmx.h               |   2 +
 arch/x86/kvm/vmx/main.c                  | 482 ++++++++++++++++++++---
 arch/x86/kvm/vmx/posted_intr.c           |   3 +-
 arch/x86/kvm/vmx/tdx.c                   | 381 +++++++++++++++++-
 arch/x86/kvm/vmx/tdx.h                   |  16 +
 arch/x86/kvm/vmx/tdx_arch.h              |  13 +
 arch/x86/kvm/vmx/x86_ops.h               |   6 +
 arch/x86/kvm/x86.c                       |   1 -
 arch/x86/kvm/x86.h                       |   2 +
 13 files changed, 1113 insertions(+), 63 deletions(-)
 create mode 100644 Documentation/virt/kvm/x86/intel-tdx.rst

-- 
2.46.0

