Date:   Tue, 31 May 2022 16:46:02 +0200
From:   Paolo Bonzini <pbonzini@...hat.com>
To:     isaku.yamahata@...el.com, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org
Cc:     isaku.yamahata@...il.com, erdemaktas@...gle.com,
        Sean Christopherson <seanjc@...gle.com>,
        Sagi Shahar <sagis@...gle.com>
Subject: Re: [RFC PATCH v6 000/104] KVM TDX basic feature support

On 5/5/22 20:13, isaku.yamahata@...el.com wrote:
> From: Isaku Yamahata <isaku.yamahata@...el.com>
> 
> Hello.  This is v6 of the patch series for KVM TDX support.
> This is based on v5.18-rc3 + kvm/queue branch + TDX HOST patch series.
> The tree can be found at https://github.com/intel/tdx/tree/kvm-upstream
> 
> Major changes from v5:
> - initialize TDX module on loading kvm_intel.ko
>    This requires changes to other archs, which I only compile-tested.
>    Needs review by each KVM arch maintainer.
> - introduced protected apic suggested by Sean Christopherson <seanjc@...gle.com>
> - use constants for non-present SPTE value
>    I tested on VMX, but only compile-tested for SVM.
> - introduced debug mode to enable #VE suppress bit for VMX and warn on #VE exit
> 
> TODO:
> - 2M large page support. It's work-in-progress.

So the only important conflicts are with the PRIVATE mapping series
(see reply to patch 47) and with commit ba3a6120a4e7:

     Author: Sean Christopherson <seanjc@...gle.com>
     Date:   Sat Apr 23 03:47:43 2022 +0000

     KVM: x86/mmu: Use atomic XCHG to write TDP MMU SPTEs with volatile bits

which are a bit boring but not hard.  If you can post a v7 relatively
soon I'd be grateful.

Paolo

> How to run/test:
> It's described at
> https://github.com/intel/tdx/blob/kvm-upstream-workaround/KVM-TDX.README.md
> 
> Trello:
> I've created a Trello board to track details. If you want to update items, please let me know.
> https://trello.com/b/B1cLGCcA/kvm-tdx
> 
> Thanks,
> Isaku Yamahata
> 
> Changes from v5:
> - export __seamcall and use it
> - move mutex lock from callee function of smp_call_on_cpu to the caller.
> - rename mmu_prezap => flush_shadow_all_private() and tdx_mmu_release_hkid
> - updated comment
> - drop the use of tdh_mng_key.reclaimid(): as the function is for backward
>    compatibility to only return success
> - struct kvm_tdx_cmd: metadata => flags, added __u64 error.
> - make this ioctl systemwide ioctl
> - ABI change to struct kvm_init_vm
> - guest_tsc_khz: use kvm->arch.default_tsc_khz
> - rename BUILD_BUG_ON_MEMCPY to MEMCPY_SAME_SIZE
> - drop exporting kvm_set_tsc_khz().
> - fix kvm_tdp_page_fault() for mtrr emulation
> - rename it to kvm_gfn_shared_mask(), dropped kvm_gpa_shared_mask()
> - drop kvm_is_private_gfn(), kept kvm_is_private_gpa()
>    keep kvm_{gfn, gpa}_private(), kvm_gpa_private()
> - update commit message
> - rename shadow_init_value => shadow_nonpresent_value
> - added ept_violation_ve_test mode
> - shadow_nonpresent_value => SHADOW_NONPRESENT_VALUE in tdp_mmu.c
> - legacy MMU case
>    => - mmu_topup_shadow_page_cache(), kvm_mmu_create()
>       - FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
> - #VE warning:
> - rename: REMOVED_SPTE => __REMOVED_SPTE, SHADOW_REMOVED_SPTE => REMOVED_SPTE
> - as discussed, merge this patch with patch
>    "KVM: x86/mmu: Allow non-zero init value for shadow PTE".
> - fix pointed by Sagi. check !is_private check => (kvm_gfn_shared_mask && !is_private)
> - introduce kvm_gfn_for_root(kvm, root, gfn)
> - add only_shared argument to kvm_tdp_mmu_handle_gfn()
> - use kvm_arch_dirty_log_supported()
> - rename SPTE_PRIVATE_PROHIBIT to SPTE_SHARED_MASK.
> - rename: is_private_prohibit_spte() => spte_shared_mask()
> - fix: shadow_nonpresent_value => SHADOW_NONPRESENT_VALUE in comment
> - dropped this patch as the change was merged into kvm/queue
> - update vt_apicv_post_state_restore()
> - use is_64_bit_hypercall()
> - comment: expand MSMI -> Machine Check System Management Interrupt
> - fixed TDX_SEPT_PFERR
> - tdvmcall_p[1234]_{write, read}() => tdvmcall_a[0123]_{read,write}()
> - rename tdvmcall_exit_reason() => tdvmcall_leaf()
> - remove optional zero check of argument.
> - do a check for static_call(kvm_x86_has_emulated_msr)(kvm, MSR_IA32_SMBASE)
>     in kvm_vcpu_ioctl_smi and __apic_accept_irq.
> - WARN_ON_ONCE in tdx_smi_allowed and tdx_enable_smi_window.
> - introduce vcpu_deliver_init to x86_ops
> - sprinkled KVM_BUG_ON()
> 
> Changes from v4:
> - rebased to TDX host kernel patch series.
> - include all the patches to make this patch series working.
> - add [MARKER] patches to mark the patch layer clear.
> 
> ---
> * What's TDX?
> TDX stands for Trust Domain Extensions, which extends Intel Virtual Machine
> Extensions (VMX) to introduce a kind of virtual machine guest called a Trust
> Domain (TD) for confidential computing.
> 
> A TD runs in a CPU mode that is designed to protect the confidentiality of its
> memory contents and its CPU state from any other software, including the hosting
> Virtual Machine Monitor (VMM), unless explicitly shared by the TD itself.
> 
> We have more detailed explanations below (***).
> We have the high-level design of TDX KVM below (****).
> 
> In this patch series, we use "TD" or "guest TD" to differentiate it from the
> current "VM" (Virtual Machine), which is supported by KVM today.
> 
> 
> * The organization of this patch series
> This patch series is on top of the patch series "TDX host kernel support":
> https://lore.kernel.org/lkml/cover.1646007267.git.kai.huang@intel.com/
> 
> This patch series is available at
> https://github.com/intel/tdx/releases/tag/kvm-upstream
> The corresponding patches to qemu are available at
> https://github.com/intel/qemu-tdx/commits/tdx-upstream
> 
> The relations of the layers are depicted as follows.
> The arrows below show the order of patch reviews we would like to have.
> 
> The layers below are chosen so that the device model, for example qemu, can
> exercise each layer step by step: check if TDX is supported, create a TD VM,
> create a TD vcpu, allow the vcpu to run, populate TD guest private memory, and
> handle vcpu exits/hypercalls/interrupts to run the TD fully.
> 
>    TDX vcpu
>    interrupt/exits/hypercall<------------\
>          ^                               |
>          |                               |
>    TD finalization                       |
>          ^                               |
>          |                               |
>    TDX EPT violation<------------\       |
>          ^                       |       |
>          |                       |       |
>    TD vcpu enter/exit            |       |
>          ^                       |       |
>          |                       |       |
>    TD vcpu creation/destruction  |       \-------KVM TDP MMU MapGPA
>          ^                       |                       ^
>          |                       |                       |
>    TD VM creation/destruction    \---------------KVM TDP MMU hooks
>          ^                                               ^
>          |                                               |
>    TDX architectural definitions                 KVM TDP refactoring for TDX
>          ^                                               ^
>          |                                               |
>     TDX, VMX    <--------TDX host kernel         KVM MMU GPA share mask
>     coexistence          support
> 
> 
> The following are explanations of each layer.  Each layer has a dummy commit
> that starts with [MARKER] in the subject.  It is intended to help identify where
> each layer starts.
> 
> TDX host kernel support:
>          https://lore.kernel.org/lkml/cover.1646007267.git.kai.huang@intel.com/
>          The guts of system-wide initialization of TDX module.  There is an
>          independent patch series for host x86.  TDX KVM patches call functions
>          this patch series provides to initialize the TDX module.
> 
> TDX, VMX coexistence:
>          Infrastructure to allow TDX to coexist with VMX and trigger the
>          initialization of the TDX module.
>          This layer starts with
>          "KVM: VMX: Move out vmx_x86_ops to 'main.c' to wrap VMX and TDX"
> TDX architectural definitions:
>          Add TDX architectural definitions and helper functions
>          This layer starts with
>          "[MARKER] The start of TDX KVM patch series: TDX architectural definitions".
> TD VM creation/destruction:
>          Guest TD creation/destruction: allocation and release of the TDX
>          specific vm and vcpu structures.  Create an initial guest memory image
>          with TDX measurement.
>          This layer starts with
>          "[MARKER] The start of TDX KVM patch series: TD VM creation/destruction".
> TD vcpu creation/destruction:
>          Guest TD creation/destruction: allocation and release of the TDX
>          specific vm and vcpu structures.  Create an initial guest memory image
>          with TDX measurement.
>          This layer starts with
>          "[MARKER] The start of TDX KVM patch series: TD vcpu creation/destruction"
> TDX EPT violation:
>          Create an initial guest memory image with TDX measurement.  Handle
>          secure EPT violations to populate guest pages with TDX SEAMCALLs.
>          This layer starts with
>          "[MARKER] The start of TDX KVM patch series: TDX EPT violation"
> TD vcpu enter/exit:
>          Allow TDX vcpu to enter into TD and exit from TD.  Save CPU state before
>          entering into TD.  Restore CPU state after exiting from TD.
>          This layer starts with
>          "[MARKER] The start of TDX KVM patch series: TD vcpu enter/exit"
> TD vcpu interrupts/exit/hypercall:
>          Handle various exits/hypercalls and allow interrupts to be injected so
>          that TD vcpu can continue running.
>          This layer starts with
>          "[MARKER] The start of TDX KVM patch series: TD vcpu exits/interrupts/hypercalls"
> 
> KVM MMU GPA shared bit:
>          Introduce a framework to handle the shared bit of the GPA.  TDX
>          repurposes a bit of the GPA to indicate shared or private.  If it's
>          shared, it's the same as the conventional VMX EPT case and the VMM can
>          access shared guest pages.  If it's private, it's handled by Secure-EPT
>          and the guest page is encrypted.
>          This layer starts with
>          "[MARKER] The start of TDX KVM patch series: KVM MMU GPA stolen bits"
> KVM TDP refactoring for TDX:
>          TDX Secure EPT requires different constants, e.g. the initial value of
>          an EPT entry.  Various refactorings for those differences.
>          This layer starts with
>          "[MARKER] The start of TDX KVM patch series: KVM TDP refactoring for TDX"
> KVM TDP MMU hooks:
>          Introduce a framework to the TDP MMU to add hooks in addition to direct
>          EPT access.  TDX adds Secure EPT, which is an enhancement to VMX EPT.
>          Unlike conventional VMX EPT, the CPU can't directly read/write Secure
>          EPT.  Instead, TDX SEAMCALLs are used to operate on Secure EPT.
>          This layer starts with
>          "[MARKER] The start of TDX KVM patch series: KVM TDP MMU hooks"
> KVM TDP MMU MapGPA:
>          Introduce a framework to handle switching guest pages between private
>          and shared.  For a given GPA, a guest page can be assigned to a
>          private GPA or a shared GPA exclusively.  With the TDX MapGPA hypercall,
>          the guest TD converts GPA assignments from private (or shared) to
>          shared (or private).
>          This layer starts with
>          "[MARKER] The start of TDX KVM patch series: KVM TDP MMU MapGPA "
> 
> KVM guest private memory: (not shown in the above diagram)
> [PATCH v4 00/12] KVM: mm: fd-based approach for supporting KVM guest private
> memory: https://lkml.org/lkml/2022/1/18/395
>          Guest private memory requires different memory management in KVM.  The
>          patch proposes a way for it.  Integration with TDX KVM.
> 
> (***)
> * TDX module
> A CPU-attested software module called the "TDX module" is designed to implement
> the TDX architecture, and it is loaded by the UEFI firmware today. It can be
> loaded by the kernel or driver at runtime, but in this patch series we assume
> that the TDX module is already loaded and initialized.
> 
> The TDX module provides two main new logical modes of operation built upon the
> new SEAM (Secure Arbitration Mode) root and non-root CPU modes added to the VMX
> architecture. TDX root mode is mostly identical to the VMX root operation mode,
> and the TDX functions (described later) are triggered by the new SEAMCALL
> instruction with the desired interface function selected by an input operand
> (leaf number, in RAX). TDX non-root mode is used for TD guest operation.  TDX
> non-root operation (i.e. "guest TD" mode) is similar to the VMX non-root
> operation (i.e. guest VM), with changes and restrictions to better assure that
> no other software or hardware has direct visibility of the TD memory and state.
> 
> TDX transitions between TDX root operation and TDX non-root operation include TD
> Entries, from TDX root to TDX non-root mode, and TD Exits from TDX non-root to
> TDX root mode.  A TD Exit might be asynchronous, triggered by some external
> event (e.g., external interrupt or SMI) or an exception, or it might be
> synchronous, triggered by a TDCALL (TDG.VP.VMCALL) function.
> 
> TD VCPUs can be entered using SEAMCALL(TDH.VP.ENTER) by KVM. TDH.VP.ENTER is one
> of the TDX interface functions as mentioned above, and "TDH" stands for Trust
> Domain Host. Those host-side TDX interface functions are categorized into
> various areas just for better organization, such as SYS (TDX module management),
> MNG (TD management), VP (VCPU), PHYSMEM (physical memory), MEM (private memory),
> etc. For example, SEAMCALL(TDH.SYS.INFO) returns the TDX module information.
> 
> TDCS (Trust Domain Control Structure) is the main control structure of a guest
> TD, and is encrypted (using the guest TD's ephemeral private key).  At a high
> level, TDCS holds information for controlling TD operation as a whole
> (execution, EPTP, MSR bitmaps, etc.) that KVM needs to set up.  Note that MSR
> bitmaps are held as part of TDCS (unlike VMX) because they are meant to have the
> same value for all VCPUs of the same TD.
> 
> Trust Domain Virtual Processor State (TDVPS) is the root control structure of a
> TD VCPU.  It helps the TDX module control the operation of the VCPU, and holds
> the VCPU state while the VCPU is not running. TDVPS is opaque to software and
> DMA access, accessible only by using the TDX module interface functions (such as
> TDH.VP.RD, TDH.VP.WR). TDVPS includes TD VMCS, and TD VMCS auxiliary structures,
> such as virtual APIC page, virtualization exception information, etc.
> 
> Several VMX control structures (such as Shared EPT and Posted interrupt
> descriptor) are directly managed and accessed by the host VMM.  These control
> structures are pointed to by fields in the TD VMCS.
> 
> The above means that 1) KVM needs to allocate different data structures for TDs,
> 2) KVM can reuse the existing code for TDs for some operations, and 3) it needs
> to define TD-specific handling for others.  For 3), redirect operations to the
> TDX specific callbacks, like "if (is_td_vcpu(vcpu)) tdx_callback() else
> vmx_callback();".
> 
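A minimal user-space sketch of the "if (is_td_vcpu(vcpu)) tdx_callback() else vmx_callback();" redirection described above. The structures are simplified stand-ins, and names such as vt_vcpu_run(), vm_type values and last_backend are illustrative assumptions, not the definitions used in the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the KVM structures. */
enum vm_type { VM_TYPE_DEFAULT, VM_TYPE_TDX };
enum { BACKEND_VMX = 1, BACKEND_TDX = 2 };

struct kvm {
	enum vm_type type;
};

struct kvm_vcpu {
	struct kvm *kvm;
	int last_backend;	/* records which backend handled the call */
};

static bool is_td_vcpu(const struct kvm_vcpu *vcpu)
{
	return vcpu->kvm->type == VM_TYPE_TDX;
}

/* Backend-specific callbacks; here they only record that they ran. */
static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
{
	vcpu->last_backend = BACKEND_VMX;
}

static void tdx_vcpu_run(struct kvm_vcpu *vcpu)
{
	vcpu->last_backend = BACKEND_TDX;
}

/* Common entry point: redirect to the TDX or the VMX callback. */
void vt_vcpu_run(struct kvm_vcpu *vcpu)
{
	assert(vcpu != NULL && vcpu->kvm != NULL);

	if (is_td_vcpu(vcpu))
		tdx_vcpu_run(vcpu);
	else
		vmx_vcpu_run(vcpu);
}
```

The point of the wrapper is that common code keeps calling a single function and the VM type decides the backend, so the existing VMX paths stay untouched for conventional VMs.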
> *TD Private Memory
> TD private memory is designed to hold TD private content, encrypted by the CPU
> using the TD ephemeral key. An encryption engine holds a table of encryption
> keys, and an encryption key is selected for each memory transaction based on a
> Host Key Identifier (HKID). By design, the host VMM does not have access to the
> encryption keys.
> 
> In the first generation of MKTME, HKID is "stolen" from the physical address by
> allocating a configurable number of bits from the top of the physical
> address. The HKID space is partitioned into shared HKIDs for legacy MKTME
> accesses and private HKIDs for SEAM-mode-only accesses. We use 0 for the shared
> HKID on the host so that MKTME can be opaque or bypassed on the host.
> 
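To illustrate how an HKID can be carried in the upper physical address bits, here is a small sketch. The widths (52 physical address bits, 6 KeyID bits) are assumptions chosen for the example; the quoted text only says the number of bits is configurable, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed widths for illustration only; real values are platform-configured. */
#define PHYS_ADDR_BITS	52u
#define KEYID_BITS	6u
#define KEYID_SHIFT	(PHYS_ADDR_BITS - KEYID_BITS)
#define KEYID_MASK	((((uint64_t)1 << KEYID_BITS) - 1) << KEYID_SHIFT)

/* Tag a physical address with an HKID placed in the "stolen" top bits. */
uint64_t pa_set_hkid(uint64_t pa, uint32_t hkid)
{
	assert(hkid < (1u << KEYID_BITS));
	/* The plain address must not already use the KeyID bits or above. */
	assert((pa >> KEYID_SHIFT) == 0);

	return pa | ((uint64_t)hkid << KEYID_SHIFT);
}

/* Extract the HKID from a tagged physical address. */
uint32_t pa_get_hkid(uint64_t pa)
{
	return (uint32_t)((pa & KEYID_MASK) >> KEYID_SHIFT);
}

/* Drop the HKID bits and return the plain physical address. */
uint64_t pa_strip_hkid(uint64_t pa)
{
	return pa & ~KEYID_MASK;
}
```

With HKID 0 the tagged address equals the plain one, which matches the remark above that 0 is used as the shared HKID on the host.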
> During TDX non-root operation (i.e. guest TD), memory accesses can be qualified
> as either shared or private, based on the value of a new SHARED bit in the Guest
> Physical Address (GPA).  The CPU translates shared GPAs using the usual VMX EPT
> (Extended Page Table) or "Shared EPT" (in this document), which resides in host
> VMM memory. The Shared EPT is directly managed by the host VMM - the same as
> with the current VMX. Since guest TDs usually require I/O, and the data exchange
> needs to be done via shared memory, KVM needs to use the current EPT
> functionality even for TDs.
> 
> * Secure EPT and Mirroring using the TDP code
> The CPU translates private GPAs using a separate Secure EPT.  The Secure EPT
> pages are encrypted and integrity-protected with the TD's ephemeral private
> key.  Secure EPT can be managed _indirectly_ by the host VMM, using the TDX
> interface functions, and thus conceptually Secure EPT is a subset of EPT (why
> "subset"). Since execution of such interface functions takes much longer
> than accessing memory directly, in KVM we use the existing TDP code to mirror
> the Secure EPT for the TD.
> 
> This way, we can effectively walk Secure EPT without using the TDX interface
> functions.
> 
> * VM life cycle and TDX specific operations
> The userspace VMM, such as QEMU, needs to build and treat TDs differently.  For
> example, a TD needs to boot in private memory, and the host software cannot copy
> the initial image to private memory.
> 
> * TSC Virtualization
> The TDX module helps TDs maintain reliable TSC (Time Stamp Counter) values
> (e.g. consistent among the TD VCPUs) and the virtual TSC frequency is determined
> by TD configuration, i.e. when the TD is created, not per VCPU.  The current KVM
> owns TSC virtualization for VMs, but the TDX module does for TDs.
> 
> * MCE support for TDs
> The TDX module doesn't allow the VMM to inject MCE.  Instead, a PV
> (paravirtualized) way is needed for the TD to communicate with the VMM.  For
> now, KVM silently ignores MCE requests from the VMM.  MSRs related to MCE (e.g.
> MCE bank registers) can be naturally emulated by paravirtualizing MSR access.
> 
> For details, the specifications [1], [2], [3], [4], [5], [6], [7] are
> available.
> 
> * Restrictions or future work
> Some features are not included to reduce patch size.  Those features will be
> addressed in future independent patch series.
> - large page (2M, 1G)
> - qemu gdb stub
> - guest PMU
> - and more
> 
> * Prerequisites
> It's required to load the TDX module and initialize it.  It's out of the scope
> of this patch series.  Another independent patch for the common x86 code is
> planned.  It defines CONFIG_INTEL_TDX_HOST and this patch series uses
> CONFIG_INTEL_TDX_HOST.  It's assumed that with CONFIG_INTEL_TDX_HOST=y, the TDX
> module is initialized and the TDX module APIs for the TDX guest life cycle,
> like tdh.mng.init, are ready for KVM to use.
> 
> Concretely, global initialization, LP (Logical Processor) initialization, global
> configuration, the key configuration, and TDMR and PAMT initialization are done.
> The state of the TDX module is SYS_READY.  Please refer to the TDX module
> specification, chapter "Intel TDX Module Lifecycle State Machine".
> 
> ** Detecting the TDX module readiness.
> TDX host patch series implements the detection of the TDX module availability
> and its initialization so that KVM can use it.  Also it manages Host KeyID
> (HKID) assigned to guest TD.
> The assumed APIs the TDX host patch series provides are
> - int seamrr_enabled()
>    Check if the required cpu feature (SEAM mode) is available. This only checks
>    CPU feature availability.  At this point, the TDX module may not be ready
>    for KVM to use.
> - int init_tdx(void);
>    Initialization of TDX module so that the TDX module is ready for KVM to use.
> - const struct tdsysinfo_struct *tdx_get_sysinfo(void);
>    Return the system wide information about the TDX module.  NULL if the TDX
>    module isn't initialized.
> - u32 tdx_get_global_keyid(void);
>    Return global key id that is used for the TDX module itself.
> - int tdx_keyid_alloc(void);
>    Allocate HKID for guest TD.
> - void tdx_keyid_free(int keyid);
>    Free HKID for guest TD.
> 
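To make the alloc/free contract of the two HKID helpers listed above concrete, here is a user-space stand-in. It is only a sketch: the sketch_ prefix, the KeyID range, the bitmap and the -1 failure value are all assumptions for illustration and say nothing about how the TDX host series implements tdx_keyid_alloc()/tdx_keyid_free().

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed partitioning: KeyIDs below FIRST_PRIVATE_KEYID are not for TDs. */
#define FIRST_PRIVATE_KEYID	4
#define NR_KEYIDS		8

static bool keyid_in_use[NR_KEYIDS];

/* Return the lowest free private KeyID, or -1 (assumed) if none is left. */
int sketch_keyid_alloc(void)
{
	for (int i = FIRST_PRIVATE_KEYID; i < NR_KEYIDS; i++) {
		if (!keyid_in_use[i]) {
			keyid_in_use[i] = true;
			return i;
		}
	}
	return -1;
}

/* Release a KeyID previously returned by sketch_keyid_alloc(). */
void sketch_keyid_free(int keyid)
{
	assert(keyid >= FIRST_PRIVATE_KEYID && keyid < NR_KEYIDS);
	assert(keyid_in_use[keyid]);

	keyid_in_use[keyid] = false;
}
```

A caller would pair one allocation with each guest TD it creates and free the KeyID when the TD is destroyed.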
> (****)
> * TDX KVM high-level design
> - Host key ID management
> Host Key ID (HKID) needs to be assigned to each TDX guest for memory encryption.
> It is assumed that the TDX host patch series implements the necessary functions:
> u32 tdx_get_global_keyid(void), int tdx_keyid_alloc(void), and
> void tdx_keyid_free(int keyid).
> 
> - Data structures and VM type
> Because TDX is different from VMX, define its own VM/VCPU structures, struct
> kvm_tdx and struct vcpu_tdx instead of struct kvm_vmx and struct vcpu_vmx.  To
> identify the VM, introduce VM-type to specify which VM type, VMX (default) or
> TDX, is used.
> 
> - VM life cycle and TDX specific operations
> Re-purpose the existing KVM_MEMORY_ENCRYPT_OP to add TDX specific operations.
> New commands are used to get the TDX system parameters, set TDX specific VM/VCPU
> parameters, set initial guest memory and measurement.
> 
> The creation of a TDX VM requires five operations in addition to the
> conventional VM creation.
>    - Get KVM system capability to check if TDX VM type is supported
>    - VM creation (KVM_CREATE_VM)
>    - New: Get the TDX specific system parameters.  KVM_TDX_GET_CAPABILITY.
>    - New: Set TDX specific VM parameters.  KVM_TDX_INIT_VM.
>    - VCPU creation (KVM_CREATE_VCPU)
>    - New: Set TDX specific VCPU parameters.  KVM_TDX_INIT_VCPU.
>    - New: Initialize guest memory as boot state and extend the measurement with
>      the memory.  KVM_TDX_INIT_MEM_REGION.
>    - New: Finalize VM. KVM_TDX_FINALIZE. Complete measurement of the initial
>      TDX VM contents.
>    - VCPU RUN (KVM_RUN)
> 
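The ordering of the steps above can be sketched as a tiny state machine. This is a simplification under stated assumptions: it models one VCPU and one memory region and enforces a strict order, whereas a real VMM repeats the VCPU and memory-region steps. The td_ names are hypothetical and no ioctl is actually issued.

```c
#include <assert.h>
#include <stddef.h>

/* Steps in the order listed in the cover letter. */
enum td_step {
	TD_CHECK_VM_TYPE,	/* check that the TDX VM type is supported */
	TD_CREATE_VM,		/* KVM_CREATE_VM */
	TD_GET_CAPABILITY,	/* new: KVM_TDX_GET_CAPABILITY */
	TD_INIT_VM,		/* new: KVM_TDX_INIT_VM */
	TD_CREATE_VCPU,		/* KVM_CREATE_VCPU */
	TD_INIT_VCPU,		/* new: KVM_TDX_INIT_VCPU */
	TD_INIT_MEM_REGION,	/* new: KVM_TDX_INIT_MEM_REGION */
	TD_FINALIZE,		/* new: KVM_TDX_FINALIZE */
	TD_RUN			/* run the VCPU */
};

struct td_lifecycle {
	enum td_step next;	/* the only step accepted next */
};

void td_lifecycle_init(struct td_lifecycle *lc)
{
	assert(lc != NULL);
	lc->next = TD_CHECK_VM_TYPE;
}

/*
 * Accept @step only if it is the expected one.  Returns 0 on success and
 * -1 if the step is out of order.  TD_RUN may be repeated indefinitely.
 */
int td_lifecycle_advance(struct td_lifecycle *lc, enum td_step step)
{
	assert(lc != NULL);

	if (step != lc->next)
		return -1;
	if (step != TD_RUN)
		lc->next = (enum td_step)(step + 1);
	return 0;
}
```

For example, trying to finalize before the initial memory has been added is rejected, which reflects that the measurement must cover the initial contents before the TD can run.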
> - Protected guest state
> Because the guest state (CPU state and guest memory) is protected, the KVM VMM
> can't operate on it, for example by accessing CPU registers, injecting
> exceptions, or accessing guest memory.  Such operations are silently ignored,
> or return zero or the initial reset value, when requested via KVM API ioctls.
> 
>      VM/VCPU state and callbacks for TDX specific operations.
>      Define TDX specific VM state and VCPU state instead of VMX ones.  Redirect
>      operations to TDX specific callbacks.  "if (tdx) tdx_op() else vmx_op()".
> 
>      Operations on the CPU state
>      Silently ignore operations on the guest state.  For example, a write to
>      CPU registers is ignored and a read from CPU registers returns 0.
> 
>      . ignore access to CPU registers except for allowed ones.
>      . TSC: add a check if the TSC is immutable and return an error, because
>        the KVM implementation updates the internal TSC state and it's difficult
>        to back out those changes.  Instead, skip the logic.
>      . dirty logging: add check if dirty logging is supported.
>      . exceptions/SMI/MCE/SIPI/INIT: silently ignore
> 
>      Note: virtual external interrupt and NMI can be injected into TDX guests.
> 
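A rough sketch of the "silently ignore" behaviour described above, using a simplified stand-in for a VCPU. The struct layout, the function names and the choice of -EPERM for the immutable-TSC error are assumptions for illustration; the quoted text only says that an error is returned.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_REGS 16

/* Hypothetical, simplified VCPU model. */
struct vcpu {
	bool guest_state_protected;	/* true for a TD vcpu */
	bool tsc_immutable;		/* true when the TDX module owns the TSC */
	uint64_t regs[NR_REGS];
	uint64_t tsc;
};

/* Reads of protected state return 0 instead of the real value. */
uint64_t vcpu_read_reg(const struct vcpu *v, unsigned int idx)
{
	assert(idx < NR_REGS);
	if (v->guest_state_protected)
		return 0;
	return v->regs[idx];
}

/* Writes to protected state are silently dropped. */
void vcpu_write_reg(struct vcpu *v, unsigned int idx, uint64_t val)
{
	assert(idx < NR_REGS);
	if (v->guest_state_protected)
		return;
	v->regs[idx] = val;
}

/* Check immutability first, so no internal state needs to be backed out. */
int vcpu_set_tsc(struct vcpu *v, uint64_t tsc)
{
	if (v->tsc_immutable)
		return -EPERM;	/* assumed error code */
	v->tsc = tsc;
	return 0;
}
```

Checking up front, before any state is touched, is what avoids the "difficult to back out" problem mentioned for the TSC case.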
> - KVM MMU integration
> One bit of the guest physical address (bit 51 or 47) is repurposed to indicate if
> the guest physical address is private (the bit is cleared) or shared (the bit is
> set).  Such repurposed bits are called stolen bits.
> 
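The shared/private encoding in the paragraph above can be sketched with a few helpers. The bit position is passed in explicitly because the text allows bit 47 or 51; the function names are hypothetical and only loosely modelled on the kvm_gfn_shared_mask() naming in the changelog.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12

typedef uint64_t gpa_t;
typedef uint64_t gfn_t;

/* Mask with only the shared bit set; the text allows bit 47 or bit 51. */
static gpa_t gpa_shared_mask(unsigned int shared_bit)
{
	assert(shared_bit == 47 || shared_bit == 51);
	return (gpa_t)1 << shared_bit;
}

/* A GPA is private when the shared bit is cleared. */
bool gpa_is_private(gpa_t gpa, unsigned int shared_bit)
{
	return !(gpa & gpa_shared_mask(shared_bit));
}

/* Return the shared alias of a GPA (shared bit set). */
gpa_t gpa_to_shared(gpa_t gpa, unsigned int shared_bit)
{
	return gpa | gpa_shared_mask(shared_bit);
}

/* Return the private alias of a GPA (shared bit cleared). */
gpa_t gpa_to_private(gpa_t gpa, unsigned int shared_bit)
{
	return gpa & ~gpa_shared_mask(shared_bit);
}

/* The same mask expressed in guest frame numbers. */
gfn_t gfn_shared_mask(unsigned int shared_bit)
{
	return gpa_shared_mask(shared_bit) >> PAGE_SHIFT;
}
```

Both aliases refer to the same guest page frame once the stolen bit is masked off, which is why the MMU has to track which of the two is currently in use.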
>    - Stolen bits framework
>      systematically tracks which guest physical address, shared or private, is
>      used.
> 
>    - Shared EPT and secure EPT
>      There are two EPTs: Shared EPT (the conventional one) and Secure
>      EPT (the new one).  Shared EPT is handled the same as before, for
>      GPAs with the stolen bit set.  Secure EPT points to private guest
>      pages.  To resolve an EPT violation, KVM walks one of the two EPTs
>      based on the faulting GPA.  Because it's costly to access Secure
>      EPT with SEAMCALLs while walking EPTs for a private guest physical
>      address, another private EPT is used as a shadow of Secure EPT
>      with the existing logic, at the cost of extra memory.
> 
> The following depicts the relationship.
> 
>                      KVM                             |       TDX module
>                       |                              |           |
>          -------------+----------                    |           |
>          |                      |                    |           |
>          V                      V                    |           |
>       shared GPA           private GPA               |           |
>    CPU shared EPT pointer  KVM private EPT pointer   |  CPU secure EPT pointer
>          |                      |                    |           |
>          |                      |                    |           |
>          V                      V                    |           V
>    shared EPT                private EPT--------mirror----->Secure EPT
>          |                      |                    |           |
>          |                      \--------------------+------\    |
>          |                                           |      |    |
>          V                                           |      V    V
>    shared guest page                                 |    private guest page
>                                                      |
>                                                      |
>                                non-encrypted memory  |    encrypted memory
>                                                      |
> 
>    - Operating on Secure EPT
>      Use the TDX module APIs to operate on Secure EPT.  To call the TDX API
>      while resolving an EPT violation, add hooks for the additional operations
>      and wire them to the TDX backend.
> 
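A toy model of the mirroring idea in the diagram above: updates go to the KVM-owned private EPT and are propagated through a hook, while lookups read only the mirror. Everything here is an assumption for illustration: a flat 8-entry table instead of a multi-level EPT, and a counter standing in for the cost of SEAMCALLs. None of these names come from the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_ENTRIES 8

/* Stand-in for state owned by the TDX module, reachable only via "SEAMCALL". */
struct tdx_module_sim {
	uint64_t secure_ept[NR_ENTRIES];
	unsigned int seamcalls;		/* how many simulated SEAMCALLs ran */
};

/* KVM-side private EPT that mirrors the Secure EPT in ordinary memory. */
struct private_ept {
	uint64_t mirror[NR_ENTRIES];
	struct tdx_module_sim *tdx;
};

/* Hook wired to the TDX backend: the only way to change the Secure EPT. */
static void sim_seamcall_set_entry(struct tdx_module_sim *tdx, size_t idx,
				   uint64_t pfn)
{
	tdx->secure_ept[idx] = pfn;
	tdx->seamcalls++;
}

/* Map a page: update the mirror, then propagate the change via the hook. */
void private_ept_map(struct private_ept *ept, size_t idx, uint64_t pfn)
{
	assert(idx < NR_ENTRIES);
	ept->mirror[idx] = pfn;
	sim_seamcall_set_entry(ept->tdx, idx, pfn);
}

/* Walk: served from the mirror only, so it costs no SEAMCALL. */
uint64_t private_ept_lookup(const struct private_ept *ept, size_t idx)
{
	assert(idx < NR_ENTRIES);
	return ept->mirror[idx];
}
```

The trade-off is the one stated in the cover letter: walks become cheap memory reads at the price of keeping a second copy of the table.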
> * References
> 
> [1] TDX specification
>     https://www.intel.com/content/www/us/en/developer/articles/technical/intel-trust-domain-extensions.html
> [2] Intel Trust Domain Extensions (Intel TDX)
>     https://cdrdv2.intel.com/v1/dl/getContent/726790
> [3] Intel CPU Architectural Extensions Specification
>     https://www.intel.com/content/dam/develop/external/us/en/documents-tps/intel-tdx-cpu-architectural-specification.pdf
> [4] Intel TDX Module 1.0 Specification
>     https://www.intel.com/content/dam/develop/external/us/en/documents/tdx-module-1.0-public-spec-v0.931.pdf
> [5] Intel TDX Loader Interface Specification
>    https://www.intel.com/content/dam/develop/external/us/en/documents-tps/intel-tdx-seamldr-interface-specification.pdf
> [6] Intel TDX Guest-Hypervisor Communication Interface
>     https://cdrdv2.intel.com/v1/dl/getContent/726790
> [7] Intel TDX Virtual Firmware Design Guide
>     https://www.intel.com/content/dam/develop/external/us/en/documents/tdx-virtual-firmware-design-guide-rev-1.01.pdf
> [8] intel public github
>     kvm TDX branch: https://github.com/intel/tdx/tree/kvm
>     TDX guest branch: https://github.com/intel/tdx/tree/guest
>     qemu TDX https://github.com/intel/qemu-tdx
> [9] TDVF
>      https://github.com/tianocore/edk2-staging/tree/TDVF
>      This was merged into EDK2 main branch. https://github.com/tianocore/edk2
> 
> Chao Gao (3):
>    KVM: x86: Move check_processor_compatibility from init ops to runtime
>      ops
>    Partially revert "KVM: Pass kvm_init()'s opaque param to additional
>      arch funcs"
>    KVM: x86: Allow to update cached values in kvm_user_return_msrs w/o
>      wrmsr
> 
> Isaku Yamahata (74):
>    KVM: Refactor CPU compatibility check on module initialiization
>    x86/virt/vmx/tdx: export platform_has_tdx
>    KVM: TDX: Detect CPU feature on kernel module initialization
>    KVM: x86: Refactor KVM VMX module init/exit functions
>    KVM: TDX: Add placeholders for TDX VM/vcpu structure
>    x86/virt/tdx: Add a helper function to return system wide info about
>      TDX module
>    KVM: TDX: Initialize TDX module when loading kvm_intel.ko
>    KVM: TDX: Make TDX VM type supported
>    [MARKER] The start of TDX KVM patch series: TDX architectural
>      definitions
>    KVM: TDX: Define TDX architectural definitions
>    KVM: TDX: Add C wrapper functions for SEAMCALLs to the TDX module
>    KVM: TDX: Add helper functions to print TDX SEAMCALL error
>    [MARKER] The start of TDX KVM patch series: TD VM creation/destruction
>    x86/cpu: Add helper functions to allocate/free TDX private host key id
>    KVM: TDX: Add place holder for TDX VM specific mem_enc_op ioctl
>    KVM: TDX: Make KVM_CAP_SET_IDENTITY_MAP_ADDR unsupported for TDX
>    KVM: TDX: Make pmu_intel.c ignore guest TD case
>    [MARKER] The start of TDX KVM patch series: TD vcpu
>      creation/destruction
>    KVM: TDX: allocate/free TDX vcpu structure
>    KVM: TDX: allocate/free TDX vcpu structure
>    [MARKER] The start of TDX KVM patch series: KVM MMU GPA shared bits
>    KVM: x86/mmu: introduce config for PRIVATE KVM MMU
>    [MARKER] The start of TDX KVM patch series: KVM TDP refactoring for
>      TDX
>    KVM: x86/mmu: Disallow fast page fault on private GPA
>    KVM: VMX: Introduce test mode related to EPT violation VE
>    [MARKER] The start of TDX KVM patch series: KVM TDP MMU hooks
>    KVM: x86/mmu: Focibly use TDP MMU for TDX
>    KVM: x86/mmu: Add a private pointer to struct kvm_mmu_page
>    KVM: x86/tdp_mmu: refactor kvm_tdp_mmu_map()
>    KVM: x86/tdp_mmu: Support TDX private mapping for TDP MMU
>    [MARKER] The start of TDX KVM patch series: TDX EPT violation
>    KVM: x86/tdp_mmu: Ignore unsupported mmu operation on private GFNs
>    KVM: TDX: don't request KVM_REQ_APIC_PAGE_RELOAD
>    KVM: TDX: TDP MMU TDX support
>    [MARKER] The start of TDX KVM patch series: KVM TDP MMU MapGPA
>    KVM: x86/mmu: steal software usable git to record if GFN is for shared
>      or not
>    KVM: x86/tdp_mmu: implement MapGPA hypercall for TDX
>    [MARKER] The start of TDX KVM patch series: TD finalization
>    KVM: TDX: Create initial guest memory
>    KVM: TDX: Finalize VM initialization
>    [MARKER] The start of TDX KVM patch series: TD vcpu enter/exit
>    KVM: TDX: Add helper assembly function to TDX vcpu
>    KVM: TDX: Implement TDX vcpu enter/exit path
>    KVM: TDX: vcpu_run: save/restore host state(host kernel gs)
>    KVM: TDX: restore host xsave state when exit from the guest TD
>    KVM: TDX: restore user ret MSRs
>    [MARKER] The start of TDX KVM patch series: TD vcpu
>      exits/interrupts/hypercalls
>    KVM: TDX: complete interrupts after tdexit
>    KVM: TDX: restore debug store when TD exit
>    KVM: TDX: handle vcpu migration over logical processor
>    KVM: x86: Add a switch_db_regs flag to handle TDX's auto-switched
>      behavior
>    KVM: TDX: remove use of struct vcpu_vmx from posted_interrupt.c
>    KVM: TDX: Implement interrupt injection
>    KVM: TDX: Implements vcpu request_immediate_exit
>    KVM: TDX: Implement methods to inject NMI
>    KVM: TDX: Add a place holder to handle TDX VM exit
>    KVM: TDX: handle EXIT_REASON_OTHER_SMI
>    KVM: TDX: handle ept violation/misconfig exit
>    KVM: TDX: handle EXCEPTION_NMI and EXTERNAL_INTERRUPT
>    KVM: TDX: Add a place holder for handler of TDX hypercalls
>      (TDG.VP.VMCALL)
>    KVM: TDX: handle KVM hypercall with TDG.VP.VMCALL
>    KVM: TDX: Handle TDX PV CPUID hypercall
>    KVM: TDX: Handle TDX PV HLT hypercall
>    KVM: TDX: Handle TDX PV port io hypercall
>    KVM: TDX: Implement callbacks for MSR operations for TDX
>    KVM: TDX: Handle TDX PV rdmsr/wrmsr hypercall
>    KVM: TDX: Handle TDX PV report fatal error hypercall
>    KVM: TDX: Handle TDX PV map_gpa hypercall
>    KVM: TDX: Handle TDG.VP.VMCALL<GetTdVmCallInfo> hypercall
>    KVM: TDX: Silently discard SMI request
>    KVM: TDX: Silently ignore INIT/SIPI
>    Documentation/virtual/kvm: Document on Trust Domain Extensions(TDX)
>    KVM: x86: design documentation on TDX support of x86 KVM TDP MMU
>    [MARKER] the end of (the first phase of) TDX KVM patch series
> 
> Rick Edgecombe (1):
>    KVM: x86/mmu: Add address conversion functions for TDX shared bits
> 
> Sean Christopherson (25):
>    KVM: VMX: Move out vmx_x86_ops to 'main.c' to wrap VMX and TDX
>    KVM: Enable hardware before doing arch VM initialization
>    KVM: x86: Introduce vm_type to differentiate default VMs from
>      confidential VMs
>    KVM: TDX: Add TDX "architectural" error codes
>    KVM: TDX: Stub in tdx.h with structs, accessors, and VMCS helpers
>    KVM: TDX: create/destroy VM structure
>    KVM: TDX: x86: Add ioctl to get TDX systemwide parameters
>    KVM: TDX: Do TDX specific vcpu initialization
>    KVM: x86/mmu: Explicitly check for MMIO spte in fast page fault
>    KVM: x86/mmu: Allow non-zero value for non-present SPTE
>    KVM: x86/mmu: Track shadow MMIO value/mask on a per-VM basis
>    KVM: x86/mmu: Allow per-VM override of the TDP max page level
>    KVM: x86/mmu: Zap only leaf SPTEs for deleted/moved memslot for
>      private mmu
>    KVM: x86/mmu: Disallow dirty logging for x86 TDX
>    KVM: VMX: Split out guts of EPT violation to common/exposed function
>    KVM: VMX: Move setting of EPT MMU masks to common VT-x code
>    KVM: TDX: Add load_mmu_pgd method for TDX
>    KVM: x86/mmu: Introduce kvm_mmu_map_tdp_page() for use by TDX
>    KVM: TDX: Add support for find pending IRQ in a protected local APIC
>    KVM: x86: Assume timer IRQ was injected if APIC state is protected
>    KVM: VMX: Modify NMI and INTR handlers to take intr_info as function
>      argument
>    KVM: VMX: Move NMI/exception handler to common helper
>    KVM: x86: Split core of hypercall emulation to helper function
>    KVM: TDX: Handle TDX PV MMIO hypercall
>    KVM: TDX: Add methods to ignore accesses to CPU state
> 
> Xiaoyao Li (1):
>    KVM: TDX: initialize VM with TDX specific parameters
> 
>   Documentation/virt/kvm/api.rst         |   30 +-
>   Documentation/virt/kvm/intel-tdx.rst   |  381 ++++
>   Documentation/virt/kvm/tdx-tdp-mmu.rst |  466 +++++
>   arch/arm64/kvm/arm.c                   |    2 +-
>   arch/mips/kvm/mips.c                   |   14 +-
>   arch/powerpc/kvm/powerpc.c             |    2 +-
>   arch/riscv/kvm/main.c                  |    2 +-
>   arch/s390/kvm/kvm-s390.c               |    2 +-
>   arch/x86/events/intel/ds.c             |    1 +
>   arch/x86/include/asm/kvm-x86-ops.h     |   10 +
>   arch/x86/include/asm/kvm_host.h        |   56 +-
>   arch/x86/include/asm/tdx.h             |   66 +
>   arch/x86/include/asm/vmx.h             |   14 +
>   arch/x86/include/uapi/asm/kvm.h        |   95 +
>   arch/x86/include/uapi/asm/vmx.h        |    5 +-
>   arch/x86/kvm/Kconfig                   |    4 +
>   arch/x86/kvm/Makefile                  |    3 +-
>   arch/x86/kvm/irq.c                     |    3 +
>   arch/x86/kvm/lapic.c                   |   37 +-
>   arch/x86/kvm/lapic.h                   |    2 +
>   arch/x86/kvm/mmu.h                     |   48 +-
>   arch/x86/kvm/mmu/mmu.c                 |  371 +++-
>   arch/x86/kvm/mmu/mmu_internal.h        |  120 ++
>   arch/x86/kvm/mmu/paging_tmpl.h         |    5 +-
>   arch/x86/kvm/mmu/spte.c                |   46 +-
>   arch/x86/kvm/mmu/spte.h                |   65 +-
>   arch/x86/kvm/mmu/tdp_iter.c            |    1 +
>   arch/x86/kvm/mmu/tdp_iter.h            |    5 +-
>   arch/x86/kvm/mmu/tdp_mmu.c             |  683 ++++++-
>   arch/x86/kvm/mmu/tdp_mmu.h             |   12 +-
>   arch/x86/kvm/svm/svm.c                 |   13 +-
>   arch/x86/kvm/vmx/common.h              |  154 ++
>   arch/x86/kvm/vmx/evmcs.c               |    2 +-
>   arch/x86/kvm/vmx/evmcs.h               |    2 +-
>   arch/x86/kvm/vmx/main.c                | 1073 ++++++++++
>   arch/x86/kvm/vmx/pmu_intel.c           |   33 +
>   arch/x86/kvm/vmx/pmu_intel.h           |   29 +
>   arch/x86/kvm/vmx/posted_intr.c         |   43 +-
>   arch/x86/kvm/vmx/posted_intr.h         |   13 +
>   arch/x86/kvm/vmx/tdx.c                 | 2470 ++++++++++++++++++++++++
>   arch/x86/kvm/vmx/tdx.h                 |  275 +++
>   arch/x86/kvm/vmx/tdx_arch.h            |  157 ++
>   arch/x86/kvm/vmx/tdx_errno.h           |   29 +
>   arch/x86/kvm/vmx/tdx_error.c           |   22 +
>   arch/x86/kvm/vmx/tdx_ops.h             |  188 ++
>   arch/x86/kvm/vmx/vmenter.S             |  146 ++
>   arch/x86/kvm/vmx/vmx.c                 |  716 +++----
>   arch/x86/kvm/vmx/vmx.h                 |   41 +-
>   arch/x86/kvm/vmx/x86_ops.h             |  235 +++
>   arch/x86/kvm/x86.c                     |  155 +-
>   arch/x86/virt/vmx/tdx/seamcall.S       |    1 +
>   arch/x86/virt/vmx/tdx/tdx.c            |   53 +-
>   arch/x86/virt/vmx/tdx/tdx.h            |   52 -
>   include/linux/kvm_host.h               |    4 +-
>   include/uapi/linux/kvm.h               |    2 +
>   tools/arch/x86/include/uapi/asm/kvm.h  |   95 +
>   tools/include/uapi/linux/kvm.h         |    1 +
>   virt/kvm/kvm_main.c                    |   67 +-
>   58 files changed, 7839 insertions(+), 783 deletions(-)
>   create mode 100644 Documentation/virt/kvm/intel-tdx.rst
>   create mode 100644 Documentation/virt/kvm/tdx-tdp-mmu.rst
>   create mode 100644 arch/x86/kvm/vmx/common.h
>   create mode 100644 arch/x86/kvm/vmx/main.c
>   create mode 100644 arch/x86/kvm/vmx/pmu_intel.h
>   create mode 100644 arch/x86/kvm/vmx/tdx.c
>   create mode 100644 arch/x86/kvm/vmx/tdx.h
>   create mode 100644 arch/x86/kvm/vmx/tdx_arch.h
>   create mode 100644 arch/x86/kvm/vmx/tdx_errno.h
>   create mode 100644 arch/x86/kvm/vmx/tdx_error.c
>   create mode 100644 arch/x86/kvm/vmx/tdx_ops.h
>   create mode 100644 arch/x86/kvm/vmx/x86_ops.h
> 
