Date: Fri, 26 Apr 2024 09:44:32 +0000
From: "Huang, Kai" <kai.huang@...el.com>
To: "Gao, Chao" <chao.gao@...el.com>
CC: "Zhang, Tina" <tina.zhang@...el.com>, "isaku.yamahata@...ux.intel.com"
	<isaku.yamahata@...ux.intel.com>, "Yuan, Hang" <hang.yuan@...el.com>,
	"seanjc@...gle.com" <seanjc@...gle.com>, "Chen, Bo2" <chen.bo@...el.com>,
	"sagis@...gle.com" <sagis@...gle.com>, "isaku.yamahata@...il.com"
	<isaku.yamahata@...il.com>, "Aktas, Erdem" <erdemaktas@...gle.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"kvm@...r.kernel.org" <kvm@...r.kernel.org>, "Yamahata, Isaku"
	<isaku.yamahata@...el.com>, "pbonzini@...hat.com" <pbonzini@...hat.com>
Subject: Re: [PATCH v19 023/130] KVM: TDX: Initialize the TDX module when
 loading the KVM intel kernel module

On Fri, 2024-04-26 at 11:21 +0800, Gao, Chao wrote:
> On Fri, Apr 26, 2024 at 12:21:46AM +0000, Huang, Kai wrote:
> > 
> > > 
> > > > > The important thing is that they're handled by _one_ entity.  What we have today
> > > > > is probably the worst setup; VMXON is handled by KVM, but TDX.SYS.LP.INIT is
> > > > > handled by core kernel (sort of).
> > > > 
> > > > I cannot argue against this :-)
> > > > 
> > > > But from this point of view, I cannot see the difference between tdx_enable()
> > > > and tdx_cpu_enable(), because they are both in the core kernel while depending
> > > > on KVM to handle VMXON.
> > > 
> > > My comments were made under the assumption that the code was NOT buggy, i.e. if
> > > KVM did NOT need to call tdx_cpu_enable() independent of tdx_enable().
> > > 
> > > That said, I do think it makes sense to have tdx_enable() call a private/inner version,
> > > e.g. __tdx_cpu_enable(), and then have KVM call a public version.  Alternatively,
> > > the kernel could register yet another cpuhp hook that runs after KVM's, i.e. does
> > > TDX.SYS.LP.INIT after KVM has done VMXON (if TDX has been enabled).
> > 
> > We will need to handle tdx_cpu_online() in "some cpuhp callback" anyway,
> > no matter whether tdx_enable() calls __tdx_cpu_enable() internally or not,
> > because now tdx_enable() can be done on a subset of cpus that the platform
> > has.
> 
> Can you confirm again that this is allowed? It seems like this code indicates
> the opposite:
> 
> https://github.com/intel/tdx-module/blob/tdx_1.5/src/vmm_dispatcher/api_calls/tdh_sys_config.c#L768C1-L775C6

This feature requires ucode/P-SEAMLDR and TDX module changes, and cannot be
supported on some *early* generations.  I think they haven't added such
code to the open-source TDX module yet.

I can ask the TDX module people about their plan if this is a concern.

In reality, this shouldn't be a problem because the current code kinda
works in both cases:

1) If this feature is not supported (i.e., old platform and/or old
module), and the user tries to enable TDX while a CPU is offline, then
tdx_enable() will fail when it does TDH.SYS.CONFIG, and we can use the
error code to pinpoint the root cause.

2) Otherwise, it just works.
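
As a rough illustration of the two cases, here is a userspace mock (not
kernel code).  Everything in it is an assumption made for the sketch: the
struct, the function name mock_sys_config() and the error value
MOCK_ERR_CPU_NOT_INITIALIZED are hypothetical and do not correspond to real
TDX module status codes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical error value for this mock; not a real TDX status code. */
#define MOCK_ERR_CPU_NOT_INITIALIZED (-2)

/* Hypothetical platform description used only by this sketch. */
struct mock_platform {
	int nr_present_cpus;        /* all CPUs the platform has */
	int nr_initialized_cpus;    /* CPUs that completed per-CPU TDX init */
	bool subset_init_supported; /* new ucode/P-SEAMLDR/module feature */
};

/* Mock of the TDH.SYS.CONFIG step performed inside tdx_enable(). */
static int mock_sys_config(const struct mock_platform *p)
{
	assert(p->nr_initialized_cpus <= p->nr_present_cpus);

	/* Case 1: old platform/module and some CPU was offline -> fail. */
	if (!p->subset_init_supported &&
	    p->nr_initialized_cpus < p->nr_present_cpus)
		return MOCK_ERR_CPU_NOT_INITIALIZED;

	/* Case 2: feature supported, or every CPU was initialized. */
	return 0;
}
```

The point of the sketch is only that the caller gets a distinct error in
case 1 and does not need to know up front which kind of platform it is on.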

> 
> > 
> > For the latter (after the "Alternatively" above), by "the kernel" do you
> > mean the core-kernel but not KVM?
> > 
> > E.g., you mean to register a cpuhp hook _inside_ tdx_enable() after TDX is
> > initialized successfully?
> > 
> > That would have a problem: when KVM is not present (e.g., KVM is
> > unloaded after it enables TDX), the cpuhp hook won't work at all.
> 
> Is "the cpuhp hook doesn't work if KVM is not loaded" a real problem?
> 
> The CPU about to online won't run any TDX code. So, it should be ok to
> skip tdx_cpu_enable().

It _can_ work if we only consider KVM, because for KVM we can always
guarantee:

1) VMXON + tdx_cpu_enable() have been done for all online cpus before it
calls tdx_enable().
2) VMXON + tdx_cpu_enable() have been done in cpuhp for any new CPU before
it goes online.
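
The two guarantees can be sketched as a small userspace mock (not kernel
code).  All names below (kvm_cpus, mock_tdx_cpu_enable(), mock_tdx_enable(),
mock_kvm_online_cpu(), mock_kvm_load()) are hypothetical stand-ins, and the
rule that the mock tdx_enable() fails when an online CPU has not been
initialized is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>

#define MOCK_NR_CPUS 4

/* Hypothetical per-CPU state for this mock. */
struct mock_cpu {
	bool online;
	bool vmxon;           /* VMXON done on this CPU */
	bool tdx_cpu_enabled; /* per-CPU TDX init done */
};

static struct mock_cpu kvm_cpus[MOCK_NR_CPUS];
static bool mock_tdx_module_ready;

/* Mock of tdx_cpu_enable(): only works after VMXON on the same CPU. */
static int mock_tdx_cpu_enable(int cpu)
{
	if (!kvm_cpus[cpu].vmxon)
		return -1;
	kvm_cpus[cpu].tdx_cpu_enabled = true;
	return 0;
}

/* Mock of tdx_enable(): assumes every online CPU was enabled first. */
static int mock_tdx_enable(void)
{
	for (int cpu = 0; cpu < MOCK_NR_CPUS; cpu++)
		if (kvm_cpus[cpu].online && !kvm_cpus[cpu].tdx_cpu_enabled)
			return -1;
	mock_tdx_module_ready = true;
	return 0;
}

/* Guarantee 2: KVM's cpuhp callback runs before the CPU goes online. */
static int mock_kvm_online_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < MOCK_NR_CPUS);
	kvm_cpus[cpu].vmxon = true;
	if (mock_tdx_cpu_enable(cpu))
		return -1;
	kvm_cpus[cpu].online = true;
	return 0;
}

/* Guarantee 1: at module load, cover all online CPUs, then tdx_enable(). */
static int mock_kvm_load(int nr_boot_cpus)
{
	assert(nr_boot_cpus > 0 && nr_boot_cpus <= MOCK_NR_CPUS);
	for (int cpu = 0; cpu < nr_boot_cpus; cpu++) {
		kvm_cpus[cpu].online = true; /* already online at load time */
		kvm_cpus[cpu].vmxon = true;
		if (mock_tdx_cpu_enable(cpu))
			return -1;
	}
	return mock_tdx_enable();
}
```

In this mock, calling mock_kvm_load(2) and then mock_kvm_online_cpu(2)
leaves every online CPU with both VMXON and the per-CPU init done, which is
the invariant the two points above describe.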

Btw, this reminds me why I didn't want to do tdx_cpu_enable() inside
tdx_enable():

tdx_enable() will need to _always_ call tdx_cpu_enable() for all online
cpus regardless of whether the module has been initialized successfully in
previous calls.

I believe this is kinda silly, i.e., why not just let the caller do
tdx_cpu_enable() for all online cpus before tdx_enable()?

However, back to the TDX-specific core-kernel cpuhp hook, in the long
term, I believe the TDX cpuhp hook should be put _BEFORE_ all in-kernel
TDX-users' cpuhp hooks, because logically TDX users should depend on TDX
core-kernel code, but not the opposite.

That is, my long term vision is we can have a simple rule:

The core-kernel TDX code always guarantees online CPUs are TDX-capable.
TDX users never need to consider tdx_cpu_enable().  They just need to
call tdx_enable() to get TDX working.
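
A userspace mock of that ordering (not kernel code, and not something the
current series does): the online callbacks run in a fixed order with the
core TDX one first, so a TDX user's callback can rely on the CPU already
being TDX-capable.  All names here are hypothetical, and the mock glosses
over who does VMXON.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MOCK_NR_CPUS 4

static bool mock_tdx_capable[MOCK_NR_CPUS]; /* set by the core TDX hook */
static bool mock_user_ready[MOCK_NR_CPUS];  /* set by the TDX user hook */

typedef int (*mock_cpuhp_fn)(int cpu);

/* Core-kernel TDX hook: makes the CPU TDX-capable. */
static int mock_tdx_core_online(int cpu)
{
	mock_tdx_capable[cpu] = true;
	return 0;
}

/* TDX user hook (e.g. KVM): depends on the core hook having run. */
static int mock_tdx_user_online(int cpu)
{
	if (!mock_tdx_capable[cpu])
		return -1;
	mock_user_ready[cpu] = true;
	return 0;
}

/* Callbacks run in array order; the core TDX entry comes first. */
static const mock_cpuhp_fn mock_online_hooks[] = {
	mock_tdx_core_online,
	mock_tdx_user_online,
};

/* Bring one CPU online by running every hook in order. */
static int mock_cpu_online(int cpu)
{
	size_t nr = sizeof(mock_online_hooks) / sizeof(mock_online_hooks[0]);

	assert(cpu >= 0 && cpu < MOCK_NR_CPUS);
	for (size_t i = 0; i < nr; i++) {
		int ret = mock_online_hooks[i](cpu);

		if (ret)
			return ret;
	}
	return 0;
}
```

With the order reversed, the user hook would fail on a CPU the core hook
has not touched yet, which is the dependency direction the rule is meant to
rule out.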

So for now, given we depend on KVM for VMXON anyway, I don't see any
reason the core-kernel should register any TDX cpuhp hook.  Having to "skip
tdx_cpu_enable() when VMX isn't enabled" is kinda hacky anyway.
