Message-ID: <9dca3a1d-eace-07ed-4cd2-09621912314a@intel.com>
Date: Fri, 6 Jan 2023 14:49:23 -0800
From: Dave Hansen <dave.hansen@...el.com>
To: Kai Huang <kai.huang@...el.com>, linux-kernel@...r.kernel.org,
kvm@...r.kernel.org
Cc: linux-mm@...ck.org, peterz@...radead.org, tglx@...utronix.de,
seanjc@...gle.com, pbonzini@...hat.com, dan.j.williams@...el.com,
rafael.j.wysocki@...el.com, kirill.shutemov@...ux.intel.com,
ying.huang@...el.com, reinette.chatre@...el.com,
len.brown@...el.com, tony.luck@...el.com, ak@...ux.intel.com,
isaku.yamahata@...el.com, chao.gao@...el.com,
sathyanarayanan.kuppuswamy@...ux.intel.com, bagasdotme@...il.com,
sagis@...gle.com, imammedo@...hat.com
Subject: Re: [PATCH v8 13/16] x86/virt/tdx: Configure global KeyID on all
packages
On 12/8/22 22:52, Kai Huang wrote:
> After the list of TDMRs and the global KeyID are configured to the TDX
> module, the kernel needs to configure the key of the global KeyID on all
> packages using TDH.SYS.KEY.CONFIG.
>
> TDH.SYS.KEY.CONFIG needs to be done on one (any) cpu for each package.
> Also, it cannot run concurrently on different cpus, so just use
> smp_call_function_single() to do it one by one.
>
> Note to keep things simple, neither the function to configure the global
> KeyID on all packages nor the tdx_enable() checks whether there's at
> least one online cpu for each package. Also, neither of them explicitly
> prevents any cpu from going offline. It is caller's responsibility to
> guarantee this.
OK, but does someone *actually* do this?
> Intel hardware doesn't guarantee cache coherency across different
> KeyIDs. The kernel needs to flush PAMT's dirty cachelines (associated
> with KeyID 0) before the TDX module uses the global KeyID to access the
> PAMT. Otherwise, those dirty cachelines can silently corrupt the TDX
> module's metadata. Note this breaks TDX from functionality point of
> view but TDX's security remains intact.
Intel hardware doesn't guarantee cache coherency across
different KeyIDs. The PAMTs are transitioning from being used
by the kernel mapping (KeyID 0) to the TDX module's "global
KeyID" mapping.

This means that the kernel must flush any dirty KeyID-0 PAMT
cachelines before the TDX module uses the global KeyID to access
the PAMT. Otherwise, if those dirty cachelines were written
back, they would corrupt the TDX module's metadata. Aside: This
corruption would be detected by the memory integrity hardware on
the next read of the memory with the global KeyID. The result
would likely be fatal to the system but would not impact TDX
security.
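To spell out the ordering problem, here is a toy userspace model of one
PAMT word and the single KeyID-0 cacheline covering it. It is only a
sketch: toy_line, toy_kernel_write(), toy_writeback(),
toy_module_write() and toy_final_mem() are all made-up names, and real
hardware is far more involved than this.

```c
#include <stdbool.h>

/* Toy model: one memory word plus one KeyID-0 cacheline that covers it. */
struct toy_line {
	unsigned int mem;	/* what memory currently holds */
	unsigned int cached;	/* KeyID-0 cached copy */
	bool dirty;		/* cached copy not yet written back */
};

/* Kernel write through the KeyID-0 mapping: lands in the cache only. */
static void toy_kernel_write(struct toy_line *l, unsigned int val)
{
	l->cached = val;
	l->dirty = true;
}

/* Write back a dirty line, as WBINVD or a later eviction would. */
static void toy_writeback(struct toy_line *l)
{
	if (l->dirty) {
		l->mem = l->cached;
		l->dirty = false;
	}
}

/*
 * TDX module write using the global KeyID. In this model it is not
 * coherent with the KeyID-0 line, so it goes straight to memory.
 */
static void toy_module_write(struct toy_line *l, unsigned int val)
{
	l->mem = val;
}

/* Memory contents after the module's write and one late eviction. */
static unsigned int toy_final_mem(bool flush_first)
{
	struct toy_line l = { 0 };

	toy_kernel_write(&l, 0xdead);	/* stale kernel data, still dirty */
	if (flush_first)
		toy_writeback(&l);	/* the flush the changelog asks for */
	toy_module_write(&l, 0x600d);	/* module metadata */
	toy_writeback(&l);		/* eviction of anything still dirty */

	return l.mem;
}
```

With flush_first set, the module's value survives. Without it, the late
writeback of the stale KeyID-0 line overwrites the module's metadata,
which is the corruption described above.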
> Following the TDX module specification, flush cache before configuring
> the global KeyID on all packages. Given the PAMT size can be large
> (~1/256th of system RAM), just use WBINVD on all CPUs to flush.
>
> Note if any TDH.SYS.KEY.CONFIG fails, the TDX module may already have
> used the global KeyID to write any PAMT. Therefore, need to use WBINVD
> to flush cache before freeing the PAMTs back to the kernel.
s/need to//
> diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> index ab961443fed5..4c779e8412f1 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.c
> +++ b/arch/x86/virt/vmx/tdx/tdx.c
> @@ -946,6 +946,66 @@ static int config_tdx_module(struct tdmr_info_list *tdmr_list, u64 global_keyid)
> 	return ret;
> }
>
> +static void do_global_key_config(void *data)
> +{
> +	int ret;
> +
> +	/*
> +	 * TDH.SYS.KEY.CONFIG may fail with entropy error (which is a
> +	 * recoverable error). Assume this is exceedingly rare and
> +	 * just return error if encountered instead of retrying.
> +	 */
> +	ret = seamcall(TDH_SYS_KEY_CONFIG, 0, 0, 0, 0, NULL, NULL);
> +
> +	*(int *)data = ret;
> +}
> +
> +/*
> + * Configure the global KeyID on all packages by doing TDH.SYS.KEY.CONFIG
> + * on one online cpu for each package. If any package doesn't have any
> + * online
This looks like it stopped mid-sentence.
> + * Note:
> + *
> + * This function neither checks whether there's at least one online cpu
> + * for each package, nor explicitly prevents any cpu from going offline.
> + * If any package doesn't have any online cpu then the SEAMCALL won't be
> + * done on that package and the later step of TDX module initialization
> + * will fail. The caller needs to guarantee this.
> + */
*Does* the caller guarantee it?
You're basically saying, "this code needs $FOO to work", but you're not
saying who *provides* $FOO.
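To make the question concrete, the guarantee amounts to something like
the check below. This is a userspace sketch, not kernel code: toy_cpu,
TOY_MAX_PACKAGES and toy_all_packages_covered() are hypothetical names,
and the package count is passed in as an assumption rather than read
from topology code.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_PACKAGES 8	/* arbitrary limit for this sketch */

struct toy_cpu {
	int package;	/* physical package ID of this CPU */
	bool online;
};

/* Return true only if every one of nr_packages has an online CPU. */
static bool toy_all_packages_covered(const struct toy_cpu *cpus,
				     size_t nr_cpus, int nr_packages)
{
	bool seen[TOY_MAX_PACKAGES] = { false };
	size_t i;
	int pkg;

	if (nr_packages < 0 || nr_packages > TOY_MAX_PACKAGES)
		return false;

	for (i = 0; i < nr_cpus; i++) {
		if (!cpus[i].online)
			continue;
		/* Reject package IDs outside the expected range. */
		if (cpus[i].package < 0 || cpus[i].package >= nr_packages)
			return false;
		seen[cpus[i].package] = true;
	}

	for (pkg = 0; pkg < nr_packages; pkg++) {
		if (!seen[pkg])
			return false;
	}

	return true;
}
```

A check like this is only meaningful if CPUs cannot go offline between
the check and the SEAMCALLs. In the kernel that would presumably mean
holding off CPU hotplug (for example with cpus_read_lock()) around both,
which is exactly the part the comment leaves to an unnamed caller.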
> +static int config_global_keyid(void)
> +{
> +	cpumask_var_t packages;
> +	int cpu, ret = 0;
> +
> +	if (!zalloc_cpumask_var(&packages, GFP_KERNEL))
> +		return -ENOMEM;
> +
> +	for_each_online_cpu(cpu) {
> +		int err;
> +
> +		if (cpumask_test_and_set_cpu(topology_physical_package_id(cpu),
> +					packages))
> +			continue;
> +
> +		/*
> +		 * TDH.SYS.KEY.CONFIG cannot run concurrently on
> +		 * different cpus, so just do it one by one.
> +		 */
> +		ret = smp_call_function_single(cpu, do_global_key_config, &err,
> +				true);
> +		if (ret)
> +			break;
> +		if (err) {
> +			ret = err;
> +			break;
> +		}
> +	}
> +
> +	free_cpumask_var(packages);
> +	return ret;
> +}
> +
> static int init_tdx_module(void)
> {
> 	/*
> @@ -998,19 +1058,46 @@ static int init_tdx_module(void)
> 	if (ret)
> 		goto out_free_pamts;
>
> +	/*
> +	 * Hardware doesn't guarantee cache coherency across different
> +	 * KeyIDs. The kernel needs to flush PAMT's dirty cachelines
> +	 * (associated with KeyID 0) before the TDX module can use the
> +	 * global KeyID to access the PAMT. Given PAMTs are potentially
> +	 * large (~1/256th of system RAM), just use WBINVD on all cpus
> +	 * to flush the cache.
> +	 *
> +	 * Follow the TDX spec to flush cache before configuring the
> +	 * global KeyID on all packages.
> +	 */
I don't think this second paragraph adds very much clarity.