[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <912df0c6bd8f42d92ccc11d9fdda1e108576a5e5.camel@intel.com>
Date: Mon, 17 Mar 2025 01:19:57 +0000
From: "Huang, Kai" <kai.huang@...el.com>
To: "tglx@...utronix.de" <tglx@...utronix.de>, "peterz@...radead.org"
<peterz@...radead.org>, "mingo@...hat.com" <mingo@...hat.com>, "Hansen, Dave"
<dave.hansen@...el.com>, "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>,
"bp@...en8.de" <bp@...en8.de>, "kirill.shutemov@...ux.intel.com"
<kirill.shutemov@...ux.intel.com>
CC: "dwmw@...zon.co.uk" <dwmw@...zon.co.uk>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "seanjc@...gle.com" <seanjc@...gle.com>,
"x86@...nel.org" <x86@...nel.org>, "sagis@...gle.com" <sagis@...gle.com>,
"hpa@...or.com" <hpa@...or.com>, "Chatre, Reinette"
<reinette.chatre@...el.com>, "Williams, Dan J" <dan.j.williams@...el.com>,
"thomas.lendacky@....com" <thomas.lendacky@....com>, "bhe@...hat.com"
<bhe@...hat.com>, "pbonzini@...hat.com" <pbonzini@...hat.com>,
"nik.borisov@...e.com" <nik.borisov@...e.com>, "ashish.kalra@....com"
<ashish.kalra@....com>, "Yamahata, Isaku" <isaku.yamahata@...el.com>
Subject: Re: [RFC PATCH 3/5] x86/kexec: Disable kexec/kdump on platforms with
TDX partial write erratum
On Fri, 2025-03-14 at 19:03 +0000, Edgecombe, Rick P wrote:
> On Thu, 2025-03-13 at 23:57 +0000, Huang, Kai wrote:
> > > So this will switch all of TDX to be default off then, unless the kernel
> > > gets a
> > > parameter set.
> > >
> >
> > Currently in KVM TDX is also default off.
>
> Good point. It begs the question of how many command line options the user
> should have to pass to enable TDX.
>
> >
> > > In which case we could also just unlock the Kconfig with just one
> > > small change. TDX and kexec would still mutually exclusive, but just at
> > > runtime.
> >
> > Yeah I am thinking this too, given the "keyID 0 integrity" thing are still on-
> > going.
>
> You mentioned offline that there used to be a command line option, but it was
> removed after discussion with Dave. I went to look for it and only found this:
> https://lore.kernel.org/lkml/7e63912a-895f-d3b3-3173-336beaa86d08@intel.com/
>
> ...where Dave just asks why it's needed. In the next version it's dropped.
> Unless there is anything more, it doesn't seem like there was really any
> objection.
Thanks for digging. :-)
I couldn't find any solid reason to argue against Dave so I just dropped it. I
could argue that "this allows people to disable TDX once for all" but it was not
something mandatory at that time.
>
> >
> > > We should try to flag Paolo and see what he thinks.
> >
> > I appreciate if you could help to do.
> >
> > >
> > > Or is the proposal to only be default tdx_host=off on the errata platforms?
> > > And
> > > tdx_host=on otherwise?
> >
> > The tricky thing is, naturally, we want to skip all the code in tdx_init() if
> > tdx_host=off, because there's no reason to do those detection/initialization
> > if
> > we are not going to use TDX, e.g., we don't need to this one:
> >
> > register_memory_notifier(&tdx_memory_nb);
> >
> > .. that means the code of detecting erratum will be skipped too.
> >
> > If we only to only make tdx_host=off as default for erratum platforms, then we
> > need to do cleanup (e.g., to unregister the above memory notifier).
>
> This is a strange point. The errata detection is not dependent on the earlier
> code in TDX init. It couldn't just be moved?
Sorry I don't quite follow your point, but seems you agreed it's not a good
idea.
>
> >
> > This isn't nice and seems hacky.
> >
> > I don't see making tdx_host=off as default has problem, anyway, as mentioned
> > above TDX is off by default in KVM.
>
> Yea, tdx_host=!errata as a default value makes it more complicated.
Yes.
>
>
> So I think the situation is we need at one kernel parameter. We already have one
> for KVM, which controls the late initialization parts of TDX that we care about
> here. So what about just using the existing one? I think we don't want two.
Logically, KVM is one user of TDX. I think whether KVM has a parameter should
not impact whether we should introduce one kernel parameter for TDX host core-
kernel.
Dan also made a point that in the context of TDX Connect, there's requirement to
make SEAMCALLs even KVM is not going to run any TDX guest:
https://lore.kernel.org/kvm/cover.1730120881.git.kai.huang@intel.com/T/#m6928f5519de25def97d47fc6bbb77f5c3e958f7b
So I agree ideally we don't want two, but I think it is also OK if there's good
reason to do so.
>
> If KVM has not initialized TDX (based on its own TDX parameter), then kexec is
> fine.
>
For now. In the future TDX module could be initialized by other kernel
components.
> It could work by exposing an interface for features to be exclusive with
> TDX. Since real TDX module initialization happens late anyway. I don't know if
> it's better than a kernel one, but I don't see adding a second one going well.
>
>
> Very, very rough:
>
> diff --git a/arch/x86/kernel/machine_kexec_64.c
> b/arch/x86/kernel/machine_kexec_64.c
> index a68f5a0a9f37..bfea4e78c577 100644
> --- a/arch/x86/kernel/machine_kexec_64.c
> +++ b/arch/x86/kernel/machine_kexec_64.c
> @@ -315,6 +315,12 @@ int machine_kexec_prepare(struct kimage *image)
> result = init_pgtable(image, __pa(control_page));
> if (result)
> return result;
> +
> + if (tdx_exclude_feature()) {
> + pr_info_once("Not allowed once TDX has been used.\n");
> + return -EOPNOTSUPP;
> + }
> +
> kexec_va_control_page = (unsigned long)control_page;
> kexec_pa_table_page = (unsigned long)__pa(image->arch.pgd);
>
> diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> index f5e2a937c1e7..9b1f42a1059c 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.c
> +++ b/arch/x86/virt/vmx/tdx/tdx.c
> @@ -1215,6 +1215,21 @@ int tdx_enable(void)
> }
> EXPORT_SYMBOL_GPL(tdx_enable);
>
> +bool tdx_exclude_feature(void)
> +{
> + bool ret = false;
> +
> + mutex_lock(&tdx_module_lock);
> + if (tdx_module_status == TDX_MODULE_INITIALIZED)
> + ret = true;
> + else
> + tdx_module_status = TDX_MODULE_EXCLUDED;
> + mutex_lock(&tdx_module_lock);
> +
> + return ret;
> +}
Assuming setting module status to "excluded" means we are not able to initialize
TDX module for ever.
The thing is Kexec has two phases: 1) loading kernel image, and 2) actually do
kexec. Your approach basically marks TDX unusable for ever when a user tries to
load a kxec kernel image, but this is a little bit nasty because loading kexec
kernel image successfully doesn't mean you have to actually do the kexec, i.e.,
you can unload the image and move on.
I am not saying this doesn't work, but IMHO it is more straightforward to just
let user make decision via kernel parameter.
Powered by blists - more mailing lists