[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aO1oKWbjeswQ-wZO@google.com>
Date: Mon, 13 Oct 2025 13:59:21 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Rick P Edgecombe <rick.p.edgecombe@...el.com>
Cc: "x86@...nel.org" <x86@...nel.org>, "kas@...nel.org" <kas@...nel.org>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>, "mingo@...hat.com" <mingo@...hat.com>,
"tglx@...utronix.de" <tglx@...utronix.de>, "bp@...en8.de" <bp@...en8.de>,
"pbonzini@...hat.com" <pbonzini@...hat.com>, Chao Gao <chao.gao@...el.com>,
Kai Huang <kai.huang@...el.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, Dan J Williams <dan.j.williams@...el.com>,
Adrian Hunter <adrian.hunter@...el.com>, "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"linux-coco@...ts.linux.dev" <linux-coco@...ts.linux.dev>, "xin@...or.com" <xin@...or.com>
Subject: Re: [RFC PATCH 3/4] KVM: x86/tdx: Do VMXON and TDX-Module
initialization during tdx_init()
On Mon, Oct 13, 2025, Rick P Edgecombe wrote:
> On Fri, 2025-10-10 at 15:04 -0700, Sean Christopherson wrote:
> > @@ -3524,34 +3453,31 @@ static int __init __tdx_bringup(void)
> > if (td_conf->max_vcpus_per_td < num_present_cpus()) {
> > pr_err("Disable TDX: MAX_VCPU_PER_TD (%u) smaller than number of logical CPUs (%u).\n",
> > td_conf->max_vcpus_per_td, num_present_cpus());
> > - goto get_sysinfo_err;
> > + return -EINVAL;
> > }
> >
> > if (misc_cg_set_capacity(MISC_CG_RES_TDX, tdx_get_nr_guest_keyids()))
> > - goto get_sysinfo_err;
> > + return -EINVAL;
> >
> > /*
> > - * Leave hardware virtualization enabled after TDX is enabled
> > - * successfully. TDX CPU hotplug depends on this.
> > + * TDX-specific cpuhp callback to disallow offlining the last CPU in a
> > + * packing while KVM is running one or more TDs. Reclaiming HKIDs
> > + * requires doing PAGE.WBINVD on every package, i.e. offlining all CPUs
> > + * of a package would prevent reclaiming the HKID.
> > */
> > + r = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "kvm/cpu/tdx:online",
> > + tdx_online_cpu, tdx_offline_cpu);
>
> Could pass NULL instead of tdx_online_cpu() and delete this version of
> tdx_online_cpu().
Oh, nice, I didn't realize (or forgot) the startup call is optional.
> Also could remove the error handling too.
No. Partly on prinicple, but also because CPUHP_AP_ONLINE_DYN can fail if the
kernel runs out of dynamic entries (currently limited to 40). The kernel WARNs
if it runs out of entries, but KVM should still do the right thing.
> Also, can we name the two tdx_offline_cpu()'s differently? This one is all about
> keyid's being in use. tdx_hkid_offline_cpu()?
Ya. And change the description to "kvm/cpu/tdx:hkid_packages"? Or something
like that.
> > + if (r < 0)
> > + goto err_cpuhup;
> > +
> > + tdx_cpuhp_state = r;
> > return 0;
> >
> > -get_sysinfo_err:
> > - __tdx_cleanup();
> > -tdx_bringup_err:
> > - kvm_disable_virtualization();
> > +err_cpuhup:
> > + misc_cg_set_capacity(MISC_CG_RES_TDX, 0);
> > return r;
> > }
> >
> > -void tdx_cleanup(void)
> > -{
> > - if (enable_tdx) {
> > - misc_cg_set_capacity(MISC_CG_RES_TDX, 0);
> > - __tdx_cleanup();
> > - kvm_disable_virtualization();
> > - }
> > -}
> > -
> > int __init tdx_bringup(void)
> > {
> > int r, i;
> > @@ -3563,6 +3489,16 @@ int __init tdx_bringup(void)
> > if (!enable_tdx)
> > return 0;
> >
> > + if (!cpu_feature_enabled(X86_FEATURE_TDX_HOST_PLATFORM)) {
> > + pr_err("TDX not supported by the host platform\n");
> > + goto success_disable_tdx;
> > + }
> > +
> > + if (!cpu_feature_enabled(X86_FEATURE_OSXSAVE)) {
> > + pr_err("OSXSAVE is required for TDX\n");
> > + return -EINVAL;
>
> Why change this condition from goto success_disable_tdx?
Ah, a copy+paste goof. I originally moved the code to tdx_enable(), then realized
it as checking OSXSAVE, not XSAVE, and so needed to be done later in boot. When
I copied it back to KVM, I forgot to restore the goto.
> > r = __tdx_bringup();
> > - if (r) {
> > - /*
> > - * Disable TDX only but don't fail to load module if the TDX
> > - * module could not be loaded. No need to print message saying
> > - * "module is not loaded" because it was printed when the first
> > - * SEAMCALL failed. Don't bother unwinding the S-EPT hooks or
> > - * vm_size, as kvm_x86_ops have already been finalized (and are
> > - * intentionally not exported). The S-EPT code is unreachable,
> > - * and allocating a few more bytes per VM in a should-be-rare
> > - * failure scenario is a non-issue.
> > - */
> > - if (r == -ENODEV)
> > - goto success_disable_tdx;
> > -
> > + if (r)
> > enable_tdx = 0;
> > - }
> >
>
> I think the previous discussion was that there should be a probe and enable
> step. We could not fail KVM init if TDX is not supported (probe), but not try to
> cleanly handle any other unexpected error (fail enabled).
>
> The existing code mostly has the "probe" type checks in tdx_bringup(), and the
> "enable" type checks in __tdx_bringup(). But now the gutted __tdx_bringup() is
> very probe-y. Ideally we could separate these into named "install" and "probe"
> functions. I don't know if this would be good to do this as part of this series
> or later though.
>
> > return r;
> >
> >
> >
>
> [snip]
>
> >
> > /*
> > * Add a memory region as a TDX memory block. The caller must make sure
> > * all memory regions are added in address ascending order and don't
> > * overlap.
> > */
> > -static int add_tdx_memblock(struct list_head *tmb_list, unsigned long start_pfn,
> > - unsigned long end_pfn, int nid)
> > +static __init int add_tdx_memblock(struct list_head *tmb_list,
> > + unsigned long start_pfn,
> > + unsigned long end_pfn, int nid)
>
> One easy thing to break this up would be to do this __init adjustments in a
> follow on patch.
Ya, for sure.
Powered by blists - more mailing lists