[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZisdtTJoqe7yxnvI@chao-email>
Date: Fri, 26 Apr 2024 11:21:25 +0800
From: Chao Gao <chao.gao@...el.com>
To: "Huang, Kai" <kai.huang@...el.com>
CC: "seanjc@...gle.com" <seanjc@...gle.com>, "Zhang, Tina"
<tina.zhang@...el.com>, "Yuan, Hang" <hang.yuan@...el.com>, "Chen, Bo2"
<chen.bo@...el.com>, "sagis@...gle.com" <sagis@...gle.com>,
"isaku.yamahata@...il.com" <isaku.yamahata@...il.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "Aktas, Erdem"
<erdemaktas@...gle.com>, "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"pbonzini@...hat.com" <pbonzini@...hat.com>, "Yamahata, Isaku"
<isaku.yamahata@...el.com>, "isaku.yamahata@...ux.intel.com"
<isaku.yamahata@...ux.intel.com>
Subject: Re: [PATCH v19 023/130] KVM: TDX: Initialize the TDX module when
loading the KVM intel kernel module
On Fri, Apr 26, 2024 at 12:21:46AM +0000, Huang, Kai wrote:
>
>>
>> > > The important thing is that they're handled by _one_ entity. What we have today
>> > > is probably the worst setup; VMXON is handled by KVM, but TDX.SYS.LP.INIT is
>> > > handled by core kernel (sort of).
>> >
>> > I cannot argue against this :-)
>> >
>> > But from this point of view, I cannot see difference between tdx_enable()
>> > and tdx_cpu_enable(), because they both in core-kernel while depend on KVM
>> > to handle VMXON.
>>
>> My comments were made under the assumption that the code was NOT buggy, i.e. if
>> KVM did NOT need to call tdx_cpu_enable() independent of tdx_enable().
>>
>> That said, I do think it makes to have tdx_enable() call an private/inner version,
>> e.g. __tdx_cpu_enable(), and then have KVM call a public version. Alternatively,
>> the kernel could register yet another cpuhp hook that runs after KVM's, i.e. does
>> TDX.SYS.LP.INIT after KVM has done VMXON (if TDX has been enabled).
>
>We will need to handle tdx_cpu_online() in "some cpuhp callback" anyway,
>no matter whether tdx_enable() calls __tdx_cpu_enable() internally or not,
>because now tdx_enable() can be done on a subset of cpus that the platform
>has.
Can you confirm this is allowed again? it seems like this code indicates the
opposite:
https://github.com/intel/tdx-module/blob/tdx_1.5/src/vmm_dispatcher/api_calls/tdh_sys_config.c#L768C1-L775C6
>
>For the latter (after the "Alternatively" above), by "the kernel" do you
>mean the core-kernel but not KVM?
>
>E.g., you mean to register a cpuhp book _inside_ tdx_enable() after TDX is
>initialized successfully?
>
>That would have problem like when KVM is not present (e.g., KVM is
>unloaded after it enables TDX), the cpuhp book won't work at all.
Is "the cpuhp hook doesn't work if KVM is not loaded" a real problem?
The CPU about to online won't run any TDX code. So, it should be ok to
skip tdx_cpu_enable().
Don't get me wrong. I don't object to registering the cpuhp hook in KVM.
I just want you to make decisions based on good information.
>
>If we ever want a new TDX-specific cpuhp hook "at this stage", IMHO it's
>better to have it done by KVM, i.e., it goes away when KVM is unloaded.
>
>Logically, we have two approaches in terms of how to treat
>tdx_cpu_enable():
>
>1) We treat the two cases separately: calling tdx_cpu_enable() for all
>online cpus, and calling it when a new CPU tries to go online in some
>cpuhp hook. And we only want to call tdx_cpu_enable() in cpuhp book when
>tdx_enable() has done successfully.
>
>That is:
>
>a) we always call tdx_cpu_enable() (or __tdx_cpu_enable()) inside
>tdx_enable() as the first step, or,
>
>b) let the caller (KVM) to make sure of tdx_cpu_enable() has been done for
>all online cpus before calling tdx_enable().
>
>Something like this:
>
> if (enable_tdx) {
> cpuhp_setup_state(CPUHP_AP_KVM_ONLINE, kvm_online_cpu,
> ...);
>
> cpus_read_lock();
> on_each_cpu(tdx_cpu_enable, ...); /* or do it inside
> * in tdx_enable() */
> enable_tdx = tdx_enable();
> if (enable_tdx)
> cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> tdx_online_cpu, ...);
> cpus_read_unlock();
> }
>
> static int tdx_online_cpu(unsigned int cpu)
> {
> unsigned long flags;
> int ret;
>
> if (!enable_tdx)
> return 0;
>
> local_irq_save(flags);
> ret = tdx_cpu_enable();
> local_irq_restore(flags);
>
> return ret;
> }
>
>2) We treat tdx_cpu_enable() as a whole by viewing it as the first step to
>run any TDX code (SEAMCALL) on any cpu, including the SEAMCALLs involved
>in tdx_enable().
>
>That is, we *unconditionally* call tdx_cpu_enable() for all online cpus,
>and when a new CPU tries to go online.
>
>This can be handled at once if we do tdx_cpu_enable() inside KVM's cpuhp
>hook:
>
> static int vt_hardware_enable(unsigned int cpu)
> {
> vmx_hardware_enable();
>
> local_irq_save(flags);
> ret = tdx_cpu_enable();
> local_irq_restore(flags);
>
> /*
> * -ENODEV means TDX is not supported by the platform
> * (TDX not enabled by the hardware or module is
> * not loaded) or the kernel isn't built with TDX.
> *
> * Allow CPU to go online as there's no way kernel
> * could use TDX in this case.
> *
> * Other error codes means TDX is available but something
> * went wrong. Prevent this CPU to go online so that
> * TDX may still work on other online CPUs.
> */
> if (ret && ret != -ENODEV)
> return ret;
>
> return ret;
> }
>
>So with your change to always enable virtualization when TDX is enabled
>during module load, we can simply have:
>
> if (enable_tdx)
> cpuhp_setup_state(CPUHP_AP_KVM_ONLINE, kvm_online_cpu,
> ...);
>
> cpus_read_lock();
> enable_tdx = tdx_enable();
> cpus_read_unlock();
> }
>
>So despite the cpus_read_lock() around tdx_enable() is a little bit silly,
>the logic is actually simpler IMHO.
>
>(local_irq_save()/restore() around tdx_cpu_enable() is also silly but that
>is a common problem to both above solution and can be changed
>independently).
>
>Also, as I mentioned that the final goal is to have a TDX-specific CPUHP
>hook in the core-kernel _BEFORE_ any in-kernel TDX user (KVM) to make sure
>all online CPUs are TDX-capable.
>
>When that happens, I can just move the code in vt_hardware_enable() to
>tdx_online_cpu() and do additional VMXOFF inside it, with the assumption
>that the in-kernel TDX users should manage VMXON/VMXOFF on their own.
>Then all TDX users can remove the handling of tdx_cpu_enable().
Powered by blists - more mailing lists