Message-ID: <20260106102440.25328-1-yan.y.zhao@intel.com>
Date: Tue, 6 Jan 2026 18:24:40 +0800
From: Yan Zhao <yan.y.zhao@...el.com>
To: pbonzini@...hat.com,
seanjc@...gle.com
Cc: linux-kernel@...r.kernel.org,
kvm@...r.kernel.org,
x86@...nel.org,
rick.p.edgecombe@...el.com,
dave.hansen@...el.com,
kas@...nel.org,
tabba@...gle.com,
ackerleytng@...gle.com,
michael.roth@....com,
david@...nel.org,
vannapurve@...gle.com,
sagis@...gle.com,
vbabka@...e.cz,
thomas.lendacky@....com,
nik.borisov@...e.com,
pgonda@...gle.com,
fan.du@...el.com,
jun.miao@...el.com,
francescolavra.fl@...il.com,
jgross@...e.com,
ira.weiny@...el.com,
isaku.yamahata@...el.com,
xiaoyao.li@...el.com,
kai.huang@...el.com,
binbin.wu@...ux.intel.com,
chao.p.peng@...el.com,
chao.gao@...el.com,
yan.y.zhao@...el.com
Subject: [PATCH v3 24/24] KVM: TDX: Turn on PG_LEVEL_2M
Turn on PG_LEVEL_2M in tdx_gmem_max_mapping_level() when TDX huge page is
enabled and the TD is RUNNABLE.
Introduce a module parameter named "tdx_huge_page" for kvm-intel.ko to
enable/disable TDX huge page. Turn TDX huge page off if the TDX module does
not support TDX_FEATURES0.ENHANCED_DEMOTE_INTERRUPTIBILITY.
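As a usage sketch (illustrative only; host tooling and the standard sysfs
module-parameter path are assumed):

```shell
# Illustrative: load kvm-intel with TDX enabled but TDX huge page disabled
modprobe kvm-intel tdx=1 tdx_huge_page=0

# The 0444 permission exposes the effective value read-only at runtime;
# it may read 'N' even when set, if the TDX module lacks the demote feature
cat /sys/module/kvm_intel/parameters/tdx_huge_page
```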
Force the page size to 4KB during TD build time to simplify the code design,
since
- tdh_mem_page_add() only adds private pages at the 4KB level.
- The number of initial memory pages is usually limited (e.g. ~4MB in a
typical Linux TD).
Update the warnings and KVM_BUG_ON() info to match the conditions when 2MB
mappings are permitted.
Signed-off-by: Xiaoyao Li <xiaoyao.li@...el.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@...el.com>
Signed-off-by: Yan Zhao <yan.y.zhao@...el.com>
---
v3:
- Introduce the module param tdx_huge_page (backed by enable_tdx_huge_page)
to toggle TDX huge page support.
- Disable TDX huge page if the TDX module does not support
TDX_FEATURES0_ENHANCE_DEMOTE_INTERRUPTIBILITY. (Kai)
- Explain in the patch log why 2M is not allowed before the TD is RUNNABLE. (Kai)
- Add comment to explain the relationship between returning PG_LEVEL_2M
and guest accept level. (Kai)
- Dropped some KVM_BUG_ON()s due to rebasing. Updated the KVM_BUG_ON()s on
mapping levels to take enable_tdx_huge_page into account.
RFC v2:
- Merged RFC v1's patch 4 (forcing PG_LEVEL_4K before TD runnable) with
patch 9 (allowing PG_LEVEL_2M after TD runnable).
---
arch/x86/kvm/vmx/tdx.c | 45 ++++++++++++++++++++++++++++++++++++------
1 file changed, 39 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 0054a9de867c..8149e89b5549 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -54,6 +54,8 @@
bool enable_tdx __ro_after_init;
module_param_named(tdx, enable_tdx, bool, 0444);
+static bool __read_mostly enable_tdx_huge_page = true;
+module_param_named(tdx_huge_page, enable_tdx_huge_page, bool, 0444);
#define TDX_SHARED_BIT_PWL_5 gpa_to_gfn(BIT_ULL(51))
#define TDX_SHARED_BIT_PWL_4 gpa_to_gfn(BIT_ULL(47))
@@ -1773,8 +1775,12 @@ static int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn,
if (KVM_BUG_ON(!vcpu, kvm))
return -EINVAL;
- /* TODO: handle large pages. */
- if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm))
+ /*
+ * Large pages are not supported before the TD is runnable or when
+ * TDX huge page is not enabled.
+ */
+ if (KVM_BUG_ON(((!enable_tdx_huge_page || kvm_tdx->state != TD_STATE_RUNNABLE) &&
+ level != PG_LEVEL_4K), kvm))
return -EIO;
WARN_ON_ONCE(!is_shadow_present_pte(mirror_spte) ||
@@ -1937,9 +1943,12 @@ static void tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
*/
if (KVM_BUG_ON(!is_hkid_assigned(to_kvm_tdx(kvm)), kvm))
return;
-
- /* TODO: handle large pages. */
- if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm))
+ /*
+ * Large pages are not supported before the TD is runnable or when
+ * TDX huge page is not enabled.
+ */
+ if (KVM_BUG_ON(((!enable_tdx_huge_page || kvm_tdx->state != TD_STATE_RUNNABLE) &&
+ level != PG_LEVEL_4K), kvm))
return;
err = tdh_do_no_vcpus(tdh_mem_range_block, kvm, &kvm_tdx->td, gpa,
@@ -3556,12 +3565,34 @@ int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
return ret;
}
+/*
+ * For private pages:
+ *
+ * Force KVM to map at 4KB level when !enable_tdx_huge_page (e.g., due to
+ * incompatible TDX module) or before TD state is RUNNABLE.
+ *
+ * Always allow KVM to map at 2MB level in other cases, though KVM may still map
+ * the page at 4KB (i.e., passing in PG_LEVEL_4K to AUG) due to
+ * (1) the backend folio is 4KB,
+ * (2) disallow_lpage restrictions:
+ * - mixed private/shared pages in the 2MB range
+ * - level misalignment due to slot base_gfn, slot size, and ugfn
+ * - guest_inhibit bit set due to guest's 4KB accept level
+ * (3) page merging is disallowed (e.g., when part of a 2MB range has been
+ * mapped at 4KB level during TD build time).
+ */
int tdx_gmem_max_mapping_level(struct kvm *kvm, kvm_pfn_t pfn, bool is_private)
{
if (!is_private)
return 0;
- return PG_LEVEL_4K;
+ if (!enable_tdx_huge_page)
+ return PG_LEVEL_4K;
+
+ if (unlikely(to_kvm_tdx(kvm)->state != TD_STATE_RUNNABLE))
+ return PG_LEVEL_4K;
+
+ return PG_LEVEL_2M;
}
static int tdx_online_cpu(unsigned int cpu)
@@ -3747,6 +3778,8 @@ static int __init __tdx_bringup(void)
if (misc_cg_set_capacity(MISC_CG_RES_TDX, tdx_get_nr_guest_keyids()))
goto get_sysinfo_err;
+ if (enable_tdx_huge_page && !tdx_supports_demote_nointerrupt(tdx_sysinfo))
+ enable_tdx_huge_page = false;
/*
* Leave hardware virtualization enabled after TDX is enabled
* successfully. TDX CPU hotplug depends on this.
--
2.43.2