linux-kernel - Re: [PATCH 1/7] KVM: TDX: Add a place holder to handle TDX VM exit

Message-ID: <Z1bS0sWkPjsaf33b@intel.com>
Date: Mon, 9 Dec 2024 19:21:54 +0800
From: Chao Gao <chao.gao@...el.com>
To: Binbin Wu <binbin.wu@...ux.intel.com>
CC: <pbonzini@...hat.com>, <seanjc@...gle.com>, <kvm@...r.kernel.org>,
	<rick.p.edgecombe@...el.com>, <kai.huang@...el.com>,
	<adrian.hunter@...el.com>, <reinette.chatre@...el.com>,
	<xiaoyao.li@...el.com>, <tony.lindgren@...ux.intel.com>,
	<isaku.yamahata@...el.com>, <yan.y.zhao@...el.com>, <michael.roth@....com>,
	<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/7] KVM: TDX: Add a place holder to handle TDX VM exit

On Sun, Dec 01, 2024 at 11:53:50AM +0800, Binbin Wu wrote:
>From: Isaku Yamahata <isaku.yamahata@...el.com>
>
>Introduce the wiring for handling TDX VM exits by implementing the
>callbacks .get_exit_info(), and .handle_exit().  Additionally, add
>error handling during the TDX VM exit flow, and add a place holder
>to handle various exit reasons.  Add helper functions to retrieve
>exit information, exit qualifications, and more.
>
>Contention Handling: The TDH.VP.ENTER operation may contend with TDH.MEM.*
>operations for secure EPT or TD EPOCH.  If contention occurs, the return
>value will have TDX_OPERAND_BUSY set with operand type, prompting the vCPU
>to attempt re-entry into the guest via the fast path.
>
>Error Handling: The following scenarios will return to userspace with
>KVM_EXIT_INTERNAL_ERROR.
>- TDX_SW_ERROR: This includes #UD caused by SEAMCALL instruction if the
>  CPU isn't in VMX operation, #GP caused by SEAMCALL instruction when TDX
>  isn't enabled by the BIOS, and TDX_SEAMCALL_VMFAILINVALID when SEAM
>  firmware is not loaded or disabled.
>- TDX_ERROR: This indicates some check failed in the TDX module, preventing
>  the vCPU from running.
>- TDX_NON_RECOVERABLE: Set by the TDX module when the error is
>  non-recoverable, indicating that the TDX guest is dead or the vCPU is
>  disabled.  This also covers failed_vmentry case, which must have
>  TDX_NON_RECOVERABLE set since off-TD debug feature has not been enabled.
>  An exception is the triple fault, which also sets TDX_NON_RECOVERABLE
>  but exits to userspace with KVM_EXIT_SHUTDOWN, aligning with the VMX
>  case.
>- Any unhandled VM exit reason will also return to userspace with
>  KVM_EXIT_INTERNAL_ERROR.
>
>Suggested-by: Sean Christopherson <seanjc@...gle.com>
>Signed-off-by: Isaku Yamahata <isaku.yamahata@...el.com>
>Co-developed-by: Binbin Wu <binbin.wu@...ux.intel.com>
>Signed-off-by: Binbin Wu <binbin.wu@...ux.intel.com>

Reviewed-by: Chao Gao <chao.gao@...el.com>
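
The two status checks in the error-handling description can be modeled outside the kernel. The sketch below is standalone C, not code from the patch. The bit values are assumptions meant to mirror the kernel's TDX status definitions (TDX_ERROR as bit 63, TDX_NON_RECOVERABLE as bit 62, TDX_SW_ERROR as TDX_ERROR plus bits 47:40) and should be checked against the headers. model_classify() and the MODEL_* labels are hypothetical names used only here. The triple fault and unhandled-exit-reason cases are not modeled.

```c
#include <stdint.h>

/* Assumed values; verify against the kernel's TDX status definitions. */
#define TDX_ERROR		(1ULL << 63)
#define TDX_NON_RECOVERABLE	(1ULL << 62)
#define TDX_SW_ERROR		(TDX_ERROR | (0xFFULL << 40))

/* Hypothetical labels for this sketch only. */
enum model_class {
	MODEL_SW_ERROR,		/* SEAMCALL itself faulted or failed */
	MODEL_TDX_ERROR,	/* TDX module reported an error */
	MODEL_SUCCESS_PATH,	/* continue to exit-reason dispatch */
};

static enum model_class model_classify(uint64_t vp_enter_ret)
{
	/* Every TDX_SW_ERROR bit must be set, so compare with the full mask. */
	if ((vp_enter_ret & TDX_SW_ERROR) == TDX_SW_ERROR)
		return MODEL_SW_ERROR;

	/* Either bit alone is enough to leave the normal exit path. */
	if (vp_enter_ret & (TDX_ERROR | TDX_NON_RECOVERABLE))
		return MODEL_TDX_ERROR;

	return MODEL_SUCCESS_PATH;
}
```

Under these assumptions TDX_SW_ERROR contains the TDX_ERROR bit, so the full-mask comparison has to come first; a status with only TDX_ERROR set then falls through to the second check.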

[..]

> fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu, bool force_immediate_exit)
> {
> 	struct vcpu_tdx *tdx = to_tdx(vcpu);
>@@ -837,9 +900,26 @@ fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu, bool force_immediate_exit)
> 	tdx->prep_switch_state = TDX_PREP_SW_STATE_UNRESTORED;
> 
> 	vcpu->arch.regs_avail &= ~VMX_REGS_LAZY_LOAD_SET;
>+
>+	if (unlikely((tdx->vp_enter_ret & TDX_SW_ERROR) == TDX_SW_ERROR))
>+		return EXIT_FASTPATH_NONE;
>+
>+	if (unlikely(tdx_check_exit_reason(vcpu, EXIT_REASON_MCE_DURING_VMENTRY)))
>+		kvm_machine_check();

I was wondering whether EXIT_REASON_MCE_DURING_VMENTRY should be handled in the
switch-case in tdx_handle_exit(), because VMX has a dedicated handler for it.
But EXIT_REASON_MCE_DURING_VMENTRY looks like a kind of VM-entry failure, so it
won't reach that switch-case. That also means VMX's handler for
EXIT_REASON_MCE_DURING_VMENTRY is actually dead code and can be removed.
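
The reasoning above can be illustrated with a standalone C model. It assumes the usual VMX exit-reason layout (basic reason in bits 15:0, VM-entry failure flag in bit 31) and that a machine check during VM entry is always reported with that flag set. model_basic() and model_route_exit() are hypothetical names; this is a sketch of the control flow, not the kernel's dispatch code.

```c
#include <stdint.h>

#define EXIT_REASON_MCE_DURING_VMENTRY	41u
#define VMX_EXIT_REASONS_FAILED_VMENTRY	0x80000000u

/* Hypothetical routing outcomes for this sketch only. */
enum model_route {
	MODEL_ROUTE_FAILED_ENTRY,	/* bails out before any per-reason handler */
	MODEL_ROUTE_SWITCH,		/* reaches the per-reason switch/handler table */
};

/* Basic exit reason: low 16 bits of the 32-bit exit reason. */
static uint16_t model_basic(uint32_t full)
{
	return (uint16_t)(full & 0xFFFFu);
}

static enum model_route model_route_exit(uint32_t full)
{
	/* The failed-VM-entry check runs before dispatching on the basic reason. */
	if (full & VMX_EXIT_REASONS_FAILED_VMENTRY)
		return MODEL_ROUTE_FAILED_ENTRY;

	return MODEL_ROUTE_SWITCH;
}
```

With the flag set, an exit whose basic reason is EXIT_REASON_MCE_DURING_VMENTRY is routed to the failed-entry path, so a per-reason handler keyed on that value is never selected in this model.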

>+
> 	trace_kvm_exit(vcpu, KVM_ISA_VMX);
> 
>-	return EXIT_FASTPATH_NONE;
>+	if (unlikely(tdx_has_exit_reason(vcpu) && tdexit_exit_reason(vcpu).failed_vmentry))
>+		return EXIT_FASTPATH_NONE;
>+
>+	return tdx_exit_handlers_fastpath(vcpu);
>+}
>+
>+static int tdx_handle_triple_fault(struct kvm_vcpu *vcpu)
>+{
>+	vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;
>+	vcpu->mmio_needed = 0;
>+	return 0;
> }
> 
> void tdx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, int pgd_level)
>@@ -1135,6 +1215,88 @@ int tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
> 	return tdx_sept_drop_private_spte(kvm, gfn, level, pfn);
> }
> 
>+int tdx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t fastpath)
>+{
>+	struct vcpu_tdx *tdx = to_tdx(vcpu);
>+	u64 vp_enter_ret = tdx->vp_enter_ret;
>+	union vmx_exit_reason exit_reason;
>+
>+	if (fastpath != EXIT_FASTPATH_NONE)
>+		return 1;
>+
>+	/*
>+	 * Handle TDX SW errors, including TDX_SEAMCALL_UD, TDX_SEAMCALL_GP and
>+	 * TDX_SEAMCALL_VMFAILINVALID.
>+	 */
>+	if (unlikely((vp_enter_ret & TDX_SW_ERROR) == TDX_SW_ERROR)) {
>+		KVM_BUG_ON(!kvm_rebooting, vcpu->kvm);
>+		goto unhandled_exit;
>+	}
>+
>+	/*
>+	 * Without off-TD debug enabled, failed_vmentry case must have
>+	 * TDX_NON_RECOVERABLE set.
>+	 */
>+	if (unlikely(vp_enter_ret & (TDX_ERROR | TDX_NON_RECOVERABLE))) {
>+		/* Triple fault is non-recoverable. */
>+		if (unlikely(tdx_check_exit_reason(vcpu, EXIT_REASON_TRIPLE_FAULT)))
>+			return tdx_handle_triple_fault(vcpu);
>+
>+		kvm_pr_unimpl("TD vp_enter_ret 0x%llx, hkid 0x%x hkid pa 0x%llx\n",
>+			      vp_enter_ret, to_kvm_tdx(vcpu->kvm)->hkid,
>+			      set_hkid_to_hpa(0, to_kvm_tdx(vcpu->kvm)->hkid));
>+		goto unhandled_exit;
>+	}
>+
>+	/* From now, the seamcall status should be TDX_SUCCESS. */
>+	WARN_ON_ONCE((vp_enter_ret & TDX_SEAMCALL_STATUS_MASK) != TDX_SUCCESS);
>+	exit_reason = tdexit_exit_reason(vcpu);
>+
>+	switch (exit_reason.basic) {
>+	default:
>+		break;
>+	}
>+
>+unhandled_exit:
>+	vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
>+	vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_UNEXPECTED_EXIT_REASON;
>+	vcpu->run->internal.ndata = 2;
>+	vcpu->run->internal.data[0] = vp_enter_ret;
>+	vcpu->run->internal.data[1] = vcpu->arch.last_vmentry_cpu;
>+	return 0;
>+}
>+
>+void tdx_get_exit_info(struct kvm_vcpu *vcpu, u32 *reason,
>+		u64 *info1, u64 *info2, u32 *intr_info, u32 *error_code)
>+{
>+	struct vcpu_tdx *tdx = to_tdx(vcpu);
>+
>+	if (tdx_has_exit_reason(vcpu)) {
>+		/*
>+		 * Encode some useful info from the 64 bit return code
>+		 * into the 32 bit exit 'reason'. If the VMX exit reason is
>+		 * valid, just set it to those bits.
>+		 */
>+		*reason = (u32)tdx->vp_enter_ret;
>+		*info1 = tdexit_exit_qual(vcpu);
>+		*info2 = tdexit_ext_exit_qual(vcpu);
>+	} else {
>+		/*
>+		 * When the VMX exit reason in vp_enter_ret is not valid,
>+		 * overload the VMX_EXIT_REASONS_FAILED_VMENTRY bit (31) to
>+		 * mean the vmexit code is not valid. Set the other bits to
>+		 * try to avoid picking a value that may someday be a valid
>+		 * VMX exit code.
>+		 */
>+		*reason = 0xFFFFFFFF;
>+		*info1 = 0;
>+		*info2 = 0;
>+	}
>+
>+	*intr_info = tdexit_intr_info(vcpu);

If there is no valid exit reason, shouldn't intr_info be set to 0?
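
For illustration, the suggestion could look like the standalone C sketch below. The vCPU accessors from the patch are replaced by plain parameters, and struct model_exit_info and model_get_exit_info() are hypothetical names, so this only models the intended behavior rather than proposing the exact kernel change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the values tdx_get_exit_info() reports. */
struct model_exit_info {
	uint32_t reason;
	uint64_t info1;
	uint64_t info2;
	uint32_t intr_info;
	uint32_t error_code;
};

static void model_get_exit_info(bool has_exit_reason, uint64_t vp_enter_ret,
				uint64_t exit_qual, uint64_t ext_exit_qual,
				uint32_t raw_intr_info,
				struct model_exit_info *out)
{
	assert(out != NULL);

	if (has_exit_reason) {
		/* Valid VMX exit reason: pass the exit details through. */
		out->reason = (uint32_t)vp_enter_ret;
		out->info1 = exit_qual;
		out->info2 = ext_exit_qual;
		out->intr_info = raw_intr_info;
	} else {
		/* No valid exit reason: report nothing that could be stale. */
		out->reason = 0xFFFFFFFFu;
		out->info1 = 0;
		out->info2 = 0;
		out->intr_info = 0;
	}

	out->error_code = 0;
}
```

The only behavioral difference from the quoted hunk is that intr_info is zeroed together with info1 and info2 when there is no valid exit reason.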

>+	*error_code = 0;
>+}
>+