Message-ID: <68914d8f61c20_55f0910074@dwillia2-xfh.jf.intel.com.notmuch>
Date: Mon, 4 Aug 2025 17:17:19 -0700
From: <dan.j.williams@...el.com>
To: Xu Yilun <yilun.xu@...ux.intel.com>, <dan.j.williams@...el.com>
CC: Chao Gao <chao.gao@...el.com>, <linux-coco@...ts.linux.dev>,
<x86@...nel.org>, <kvm@...r.kernel.org>, <seanjc@...gle.com>,
<pbonzini@...hat.com>, <eddie.dong@...el.com>, <kirill.shutemov@...el.com>,
<dave.hansen@...el.com>, <kai.huang@...el.com>, <isaku.yamahata@...el.com>,
<elena.reshetova@...el.com>, <rick.p.edgecombe@...el.com>, Farrah Chen
<farrah.chen@...el.com>, "Kirill A. Shutemov"
<kirill.shutemov@...ux.intel.com>, Dave Hansen <dave.hansen@...ux.intel.com>,
Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
Borislav Petkov <bp@...en8.de>, "H. Peter Anvin" <hpa@...or.com>,
<linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH 07/20] x86/virt/tdx: Expose SEAMLDR information via
sysfs
Xu Yilun wrote:
> > > > - Create drivers/virt/coco/tdx-tsm/bus.c for registering the tdx_subsys.
> > > > The tdx_subsys has sysfs attributes like "version" (host and guest
> > > > need this, but have different calls to get at the information) and
> > > > "firmware" (only host needs that). So the common code will take sysfs
> > > > groups passed as a parameter.
> > > >
> > > > - The "tdx_tsm" device which is unused in this patch set can be
> > >
> > > It is used in this patch: Chao creates the TDX module 'version' attr on
> > > this device. But I assume you have a different opinion: tdx_subsys
> > > represents the whole tdx_module and should have the 'version', and
> > > tdx_tsm is a sub-device dedicated to TDX Connect, is that right?
> >
> > The main reason for a tdx_tsm device in addition to the subsys is to
> > allow for deferred attachment.
>
> I've found another reason: to dynamically control the tdx_tsm lifecycle.
> The tdx_tsm driver uses seamcalls, so its functionality relies on tdx
> module initialization & vmxon. The former is a one-way path, but vmxon can
> be dynamically turned off by KVM. vmxoff is fatal to the tdx_tsm driver,
> especially on some cannot-fail destroy paths.
>
> So my idea is to remove tdx_tsm device (thus disables tdx_tsm driver) on
> vmxoff.
>
>  KVM                TDX core            TDX TSM driver
>  -----------------------------------------------------
>  tdx_disable()
>                     tdx_tsm dev del
>                                         driver.remove()
>  vmxoff()
>
> An alternative is to move vmxon/off management out of KVM, that requires
> a lot of complex work IMHO, Chao & I both prefer not to touch it.

It is fine to require that vmxon/off management remain within KVM, and
tie the lifetime of the device to the lifetime of the kvm_intel module*.
However, I think it is too violent to add/remove the device on async
vmxon/vmxoff.

Are there more sources of async vmxoff besides CPU offline, system
suspend, or system shutdown?

The suspend and shutdown cases can be handled with suspend and shutdown
callbacks in the tdx_tsm driver. Those will be called before KVM's
vmxoff. For CPU offline, is it safe to assume that the driver will not
be invoked from those CPUs?

Are there other sources of vmxoff?

> That said, we still want to "deal with bus/driver binding logic" so faux
> is not a good fit.

A faux device gives you a bus / driver-binding flow; it just expects that
the driver is always ready to bind immediately upon device creation.

> > Now, that said, the faux_device infrastructure has arrived since this
> > all started and *could* replace tdx_subsys. The only concern is whether
> > the tdx_tsm driver ever needs to do probe deferral to wait for IOMMU or
> > PCI initialization to happen first.
>
> The tdx_tsm driver needs to wait for IOMMU/PCI initialization...

Intel IOMMU cannot be modular and arrives at rootfs_initcall(). PCI
arrives at subsys_initcall(). The earliest that KVM arrives is
late_initcall() when it is built-in.

Hmm, so faux_device could work: all dependencies are resolved before the
device is created.

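
To make the ordering argument concrete, here is a small userspace model, not kernel code. The enum and both helpers are made-up names for illustration; only the relative order of the levels is taken from include/linux/init.h (the early level and the _sync variants are left out for brevity).

```c
#include <stdbool.h>

/*
 * Built-in initcall levels in the order the kernel runs them.
 * The enum is illustrative; the kernel does not define it in this form.
 */
enum initcall_level {
	LVL_PURE,
	LVL_CORE,
	LVL_POSTCORE,
	LVL_ARCH,
	LVL_SUBSYS,	/* subsys_initcall(): PCI, per the text above */
	LVL_FS,
	LVL_ROOTFS,	/* rootfs_initcall(): Intel IOMMU, per the text above */
	LVL_DEVICE,
	LVL_LATE,	/* late_initcall(): earliest built-in KVM, per the text above */
};

/* A level with a smaller value has fully run before a later level starts. */
static bool runs_before(enum initcall_level a, enum initcall_level b)
{
	return a < b;
}

/*
 * Hypothetical check: may the device be created at level 'create' with
 * PCI and the IOMMU guaranteed to be initialized already?
 */
static bool tdx_tsm_deps_ready(enum initcall_level pci,
			       enum initcall_level iommu,
			       enum initcall_level create)
{
	return runs_before(pci, create) && runs_before(iommu, create);
}
```

With PCI at LVL_SUBSYS and the IOMMU at LVL_ROOTFS, creation at LVL_LATE passes the check, which is the case where no probe deferral is needed.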
> > If probe deferral is needed that requires a bus, if probe can always be
> > synchronous with TDX module init then faux_device could work.
>
> ... but doesn't see need for TDX Module early init now. Again TDX Module
> init requires vmxon, so it can't be earlier than KVM init, nor the
> IOMMU/PCI init. So probe synchronous with TDX module init should be OK.
>
> But considering the tdx tsm's lifecycle concern, I still don't prefer
> faux.

If there are other sources of async vmxoff that are not handled by
'suspend' and 'shutdown' handlers in the tdx_tsm driver, then perhaps add
a flag that gets toggled to fail requests. Otherwise it feels like the
tdx_tsm device should only end life at vt_exit() / tdx_cleanup().

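
A rough userspace sketch of the "flag that gets toggled to fail requests" idea, using C11 atomics rather than kernel primitives. Every name here (tdx_tsm_ready, tdx_tsm_set_ready(), tdx_tsm_request(), fake_seamcall()) is hypothetical and only meant to show the shape of the gate:

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical gate: true only while VMX is on and SEAMCALLs may be issued. */
static atomic_bool tdx_tsm_ready;

/* Would be called from whatever (assumed) hooks run around vmxon / vmxoff. */
static void tdx_tsm_set_ready(bool ready)
{
	atomic_store(&tdx_tsm_ready, ready);
}

/* Stand-in for a real SEAMCALL wrapper; it always succeeds in this model. */
static int fake_seamcall(void)
{
	return 0;
}

/* Every request checks the gate first and fails fast when it is closed. */
static int tdx_tsm_request(void)
{
	if (!atomic_load(&tdx_tsm_ready))
		return -ENODEV;
	return fake_seamcall();
}
```

Note that this model does not stop vmxoff from racing with a request that has already passed the check; a real driver would presumably need a lock or similar to drain in-flight requests before the flag flips.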
> Thanks,
> Yilun

* It would be unfortunate if userspace needed to manually probe for TDX
Connect when KVM is not built-in. We might add a simple module that
requests kvm_intel in that case:

    static const struct x86_cpu_id tdx_connect_autoprobe_ids[] = {
            X86_MATCH_FEATURE(X86_FEATURE_TDX_HOST_PLATFORM, NULL),
            {}
    };
    MODULE_DEVICE_TABLE(x86cpu, tdx_connect_autoprobe_ids);

...to allow for userspace to have dependencies on TDX Connect services
arriving automatically without needing to manually demand load
kvm_intel. That module would just immediately exit if TDX Connect
capability is not found.