lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250113021050.18828-1-yan.y.zhao@intel.com>
Date: Mon, 13 Jan 2025 10:10:50 +0800
From: Yan Zhao <yan.y.zhao@...el.com>
To: pbonzini@...hat.com,
	seanjc@...gle.com,
	kvm@...r.kernel.org
Cc: linux-kernel@...r.kernel.org,
	rick.p.edgecombe@...el.com,
	kai.huang@...el.com,
	adrian.hunter@...el.com,
	reinette.chatre@...el.com,
	xiaoyao.li@...el.com,
	tony.lindgren@...el.com,
	binbin.wu@...ux.intel.com,
	dmatlack@...gle.com,
	isaku.yamahata@...el.com,
	isaku.yamahata@...il.com
Subject: [PATCH 1/7] KVM: TDX: Return -EBUSY when tdh_mem_page_add() encounters TDX_OPERAND_BUSY

tdh_mem_page_add() is called during TD build time. Within the TDX module,
it acquires the exclusive lock on the TDR resource (eliminating the need to
hold locks for TDCS/SEPT tree) and the exclusive lock on the PAMT entry for
the page to be added. The TDX module returns TDX_OPERAND_BUSY if
tdh_mem_page_add() contends with other SEAMCALLs.

SEAMCALL                Lock Type        Resource
-----------------------------------------------------------------------
tdh_mem_page_add        EXCLUSIVE        TDR
                        NO_LOCK          TDCS
                        NO_LOCK          SEPT tree
                        EXCLUSIVE        PAMT entry for the page to add

Given (1)-(4) and the expected behavior from userspace (5), KVM doesn't
expect tdh_mem_page_add() to encounter TDX_OPERAND_BUSY:
(1) tdx_vcpu_create() only allows vCPU creation when the TD state is
    TD_STATE_INITIALIZED, so tdh_mem_page_add(), as invoked in vCPU ioctl,
    does not contend with
    tdh_mng_create()/tdh_mng_addcx()/tdh_mng_key_config()/tdh_mng_init().

(2) tdx_vcpu_ioctl() bails out on TD_STATE_RUNNABLE, so tdh_mem_page_add()
    does not contend with tdh_vp_enter()/tdh_mem_page_aug()/tdh_mem_track()
    and TDCALLs.

(3) By holding slots_lock and the filemap invalidate lock,
    tdh_mem_page_add() does not contend with tdh_mr_finalize(),
    tdh_mem_page_remove()/tdh_mem_range_block()/
    tdh_phymem_page_wbinvd_hkid() or another tdh_mem_page_add(),
    tdh_mem_sept_add()/tdh_mr_extend().

(4) By holding reference to kvm, tdh_mem_page_add() does not contend with
    tdh_mng_vpflushdone()/tdh_phymem_cache_wb()/tdh_mng_key_freeid()/
    tdh_phymem_page_wbinvd_tdr()/tdh_phymem_page_reclaim().

(5) A well-behaved userspace invokes ioctl KVM_TDX_INIT_MEM_REGION on one
    vCPU after initializing all vCPUs and does not invoke ioctls on the
    other vCPUs before the KVM_TDX_INIT_MEM_REGION completes. Thus,
    tdh_mem_page_add() does not contend with
    tdh_vp_create()/tdh_vp_addcx()/tdh_vp_init*()/tdh_vp_rd()/tdh_vp_wr()/
    tdh_mng_rd()/tdh_vp_flush() on the other vCPUs.

However, if the userspace breaks (5), tdh_mem_page_add() could encounter
TDX_OPERAND_BUSY when trying to acquire the exclusive lock on the TDR
resource in the TDX module. In this case, simply return -EBUSY.

Signed-off-by: Yan Zhao <yan.y.zhao@...el.com>
---
tdx_vcpu_pre_run() will check TD_STATE_RUNNABLE for (2).
https://lore.kernel.org/kvm/3576c721-3ef2-40bd-8764-b50912df93a2@intel.com/
---
 arch/x86/kvm/vmx/tdx.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index d0dc3200fa37..1cf3ef0faff7 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -3024,13 +3024,11 @@ static int tdx_gmem_post_populate(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
 	}
 
 	ret = 0;
-	do {
-		err = tdh_mem_page_add(kvm_tdx->tdr_pa, gpa, pfn_to_hpa(pfn),
-				       pfn_to_hpa(page_to_pfn(page)),
-				       &entry, &level_state);
-	} while (err == TDX_ERROR_SEPT_BUSY);
+	err = tdh_mem_page_add(kvm_tdx->tdr_pa, gpa, pfn_to_hpa(pfn),
+			       pfn_to_hpa(page_to_pfn(page)),
+			       &entry, &level_state);
 	if (err) {
-		ret = -EIO;
+		ret = unlikely(err & TDX_OPERAND_BUSY) ? -EBUSY : -EIO;
 		goto out;
 	}
 
-- 
2.43.2


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ