lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 5 Jul 2023 16:13:23 +0800
From:   Yuan Yao <yuan.yao@...ux.intel.com>
To:     Kai Huang <kai.huang@...el.com>
Cc:     linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        linux-mm@...ck.org, x86@...nel.org, dave.hansen@...el.com,
        kirill.shutemov@...ux.intel.com, tony.luck@...el.com,
        peterz@...radead.org, tglx@...utronix.de, bp@...en8.de,
        mingo@...hat.com, hpa@...or.com, seanjc@...gle.com,
        pbonzini@...hat.com, david@...hat.com, dan.j.williams@...el.com,
        rafael.j.wysocki@...el.com, ashok.raj@...el.com,
        reinette.chatre@...el.com, len.brown@...el.com, ak@...ux.intel.com,
        isaku.yamahata@...el.com, ying.huang@...el.com, chao.gao@...el.com,
        sathyanarayanan.kuppuswamy@...ux.intel.com, nik.borisov@...e.com,
        bagasdotme@...il.com, sagis@...gle.com, imammedo@...hat.com
Subject: Re: [PATCH v12 15/22] x86/virt/tdx: Configure global KeyID on all
 packages

On Tue, Jun 27, 2023 at 02:12:45AM +1200, Kai Huang wrote:
> After the list of TDMRs and the global KeyID are configured to the TDX
> module, the kernel needs to configure the key of the global KeyID on all
> packages using TDH.SYS.KEY.CONFIG.
>
> This SEAMCALL cannot run parallel on different cpus.  Loop all online
> cpus and use smp_call_on_cpu() to call this SEAMCALL on the first cpu of
> each package.
>
> To keep things simple, this implementation takes no affirmative steps to
> online cpus to make sure there's at least one cpu for each package.  The
> callers (aka. KVM) can ensure success by ensuring sufficient CPUs are
> online for this to succeed.
>
> Intel hardware doesn't guarantee cache coherency across different
> KeyIDs.  The PAMTs are transitioning from being used by the kernel
> mapping (KeyId 0) to the TDX module's "global KeyID" mapping.
>
> This means that the kernel must flush any dirty KeyID-0 PAMT cachelines
> before the TDX module uses the global KeyID to access the PAMTs.
> Otherwise, if those dirty cachelines were written back, they would
> corrupt the TDX module's metadata.  Aside: This corruption would be
> detected by the memory integrity hardware on the next read of the memory
> with the global KeyID.  The result would likely be fatal to the system
> but would not impact TDX security.
>
> Following the TDX module specification, flush cache before configuring
> the global KeyID on all packages.  Given the PAMT size can be large
> (~1/256th of system RAM), just use WBINVD on all CPUs to flush.
>
> If TDH.SYS.KEY.CONFIG fails, the TDX module may already have used the
> global KeyID to write the PAMTs.  Therefore, use WBINVD to flush cache
> before returning the PAMTs back to the kernel.  Also convert all PAMTs
> back to normal by using MOVDIR64B as suggested by the TDX module spec,
> although on the platform without the "partial write machine check"
> erratum it's OK to leave PAMTs as is.
>
> Signed-off-by: Kai Huang <kai.huang@...el.com>
> Reviewed-by: Isaku Yamahata <isaku.yamahata@...el.com>
> Reviewed-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
> ---
>
> v11 -> v12:
>  - Added Kirill's tag
>  - Improved changelog (Nikolay)
>
> v10 -> v11:
>  - Convert PAMTs back to normal when module initialization fails.
>  - Fixed an error in changelog
>
> v9 -> v10:
>  - Changed to use 'smp_call_on_cpu()' directly to do key configuration.
>
> v8 -> v9:
>  - Improved changelog (Dave).
>  - Improved comments to explain the function to configure global KeyID
>    "takes no affirmative action to online any cpu". (Dave).
>  - Improved other comments suggested by Dave.
>
> v7 -> v8: (Dave)
>  - Changelog changes:
>   - Point out this is the step of "multi-steps" of init_tdx_module().
>   - Removed MOVDIR64B part.
>   - Other changes due to removing TDH.SYS.SHUTDOWN and TDH.SYS.LP.INIT.
>  - Changed to loop over online cpus and use smp_call_function_single()
>    directly as the patch to shut down TDX module has been removed.
>  - Removed MOVDIR64B part in comment.
>
> v6 -> v7:
>  - Improved changelong and comment to explain why MOVDIR64B isn't used
>    when returning PAMTs back to the kernel.
>
>
> ---
>  arch/x86/virt/vmx/tdx/tdx.c | 135 +++++++++++++++++++++++++++++++++++-
>  arch/x86/virt/vmx/tdx/tdx.h |   1 +
>  2 files changed, 134 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> index 1992245290de..f5d4dbc11aee 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.c
> +++ b/arch/x86/virt/vmx/tdx/tdx.c
> @@ -31,6 +31,7 @@
>  #include <asm/msr.h>
>  #include <asm/archrandom.h>
>  #include <asm/page.h>
> +#include <asm/special_insns.h>
>  #include <asm/tdx.h>
>  #include "tdx.h"
>
> @@ -577,7 +578,8 @@ static void tdmr_get_pamt(struct tdmr_info *tdmr, unsigned long *pamt_base,
>  	*pamt_size = pamt_sz;
>  }
>
> -static void tdmr_free_pamt(struct tdmr_info *tdmr)
> +static void tdmr_do_pamt_func(struct tdmr_info *tdmr,
> +		void (*pamt_func)(unsigned long base, unsigned long size))
>  {
>  	unsigned long pamt_base, pamt_size;
>
> @@ -590,9 +592,19 @@ static void tdmr_free_pamt(struct tdmr_info *tdmr)
>  	if (WARN_ON_ONCE(!pamt_base))
>  		return;
>
> +	(*pamt_func)(pamt_base, pamt_size);
> +}
> +
> +static void free_pamt(unsigned long pamt_base, unsigned long pamt_size)
> +{
>  	free_contig_range(pamt_base >> PAGE_SHIFT, pamt_size >> PAGE_SHIFT);
>  }
>
> +static void tdmr_free_pamt(struct tdmr_info *tdmr)
> +{
> +	tdmr_do_pamt_func(tdmr, free_pamt);
> +}
> +
>  static void tdmrs_free_pamt_all(struct tdmr_info_list *tdmr_list)
>  {
>  	int i;
> @@ -621,6 +633,41 @@ static int tdmrs_set_up_pamt_all(struct tdmr_info_list *tdmr_list,
>  	return ret;
>  }
>
> +/*
> + * Convert TDX private pages back to normal by using MOVDIR64B to
> + * clear these pages.  Note this function doesn't flush cache of
> + * these TDX private pages.  The caller should make sure of that.
> + */
> +static void reset_tdx_pages(unsigned long base, unsigned long size)
> +{
> +	const void *zero_page = (const void *)page_address(ZERO_PAGE(0));
> +	unsigned long phys, end;
> +
> +	end = base + size;
> +	for (phys = base; phys < end; phys += 64)
> +		movdir64b(__va(phys), zero_page);

Worried write overflow at beginning but then I recalled that
PAMT size is 4KB aligned for 1G/2M/4K entries, thus:

Reviewed-by: Yuan Yao <yuan.yao@...el.com>

> +
> +	/*
> +	 * MOVDIR64B uses WC protocol.  Use memory barrier to
> +	 * make sure any later user of these pages sees the
> +	 * updated data.
> +	 */
> +	mb();
> +}
> +
> +static void tdmr_reset_pamt(struct tdmr_info *tdmr)
> +{
> +	tdmr_do_pamt_func(tdmr, reset_tdx_pages);
> +}
> +
> +static void tdmrs_reset_pamt_all(struct tdmr_info_list *tdmr_list)
> +{
> +	int i;
> +
> +	for (i = 0; i < tdmr_list->nr_consumed_tdmrs; i++)
> +		tdmr_reset_pamt(tdmr_entry(tdmr_list, i));
> +}
> +
>  static unsigned long tdmrs_count_pamt_kb(struct tdmr_info_list *tdmr_list)
>  {
>  	unsigned long pamt_size = 0;
> @@ -898,6 +945,55 @@ static int config_tdx_module(struct tdmr_info_list *tdmr_list, u64 global_keyid)
>  	return ret;
>  }
>
> +static int do_global_key_config(void *data)
> +{
> +	/*
> +	 * TDH.SYS.KEY.CONFIG may fail with entropy error (which is a
> +	 * recoverable error).  Assume this is exceedingly rare and
> +	 * just return error if encountered instead of retrying.
> +	 *
> +	 * All '0's are just unused parameters.
> +	 */
> +	return seamcall(TDH_SYS_KEY_CONFIG, 0, 0, 0, 0, NULL, NULL);
> +}
> +
> +/*
> + * Attempt to configure the global KeyID on all physical packages.
> + *
> + * This requires running code on at least one CPU in each package.  If a
> + * package has no online CPUs, that code will not run and TDX module
> + * initialization (TDMR initialization) will fail.
> + *
> + * This code takes no affirmative steps to online CPUs.  Callers (aka.
> + * KVM) can ensure success by ensuring sufficient CPUs are online for
> + * this to succeed.
> + */
> +static int config_global_keyid(void)
> +{
> +	cpumask_var_t packages;
> +	int cpu, ret = -EINVAL;
> +
> +	if (!zalloc_cpumask_var(&packages, GFP_KERNEL))
> +		return -ENOMEM;
> +
> +	for_each_online_cpu(cpu) {
> +		if (cpumask_test_and_set_cpu(topology_physical_package_id(cpu),
> +					packages))
> +			continue;
> +
> +		/*
> +		 * TDH.SYS.KEY.CONFIG cannot run concurrently on
> +		 * different cpus, so just do it one by one.
> +		 */
> +		ret = smp_call_on_cpu(cpu, do_global_key_config, NULL, true);
> +		if (ret)
> +			break;
> +	}
> +
> +	free_cpumask_var(packages);
> +	return ret;
> +}
> +
>  static int init_tdx_module(void)
>  {
>  	struct tdsysinfo_struct *sysinfo;
> @@ -956,15 +1052,47 @@ static int init_tdx_module(void)
>  	if (ret)
>  		goto out_free_pamts;
>
> +	/*
> +	 * Hardware doesn't guarantee cache coherency across different
> +	 * KeyIDs.  The kernel needs to flush PAMT's dirty cachelines
> +	 * (associated with KeyID 0) before the TDX module can use the
> +	 * global KeyID to access the PAMT.  Given PAMTs are potentially
> +	 * large (~1/256th of system RAM), just use WBINVD on all cpus
> +	 * to flush the cache.
> +	 */
> +	wbinvd_on_all_cpus();
> +
> +	/* Config the key of global KeyID on all packages */
> +	ret = config_global_keyid();
> +	if (ret)
> +		goto out_reset_pamts;
> +
>  	/*
>  	 * TODO:
>  	 *
> -	 *  - Configure the global KeyID on all packages.
>  	 *  - Initialize all TDMRs.
>  	 *
>  	 *  Return error before all steps are done.
>  	 */
>  	ret = -EINVAL;
> +out_reset_pamts:
> +	if (ret) {
> +		/*
> +		 * Part of PAMTs may already have been initialized by the
> +		 * TDX module.  Flush cache before returning PAMTs back
> +		 * to the kernel.
> +		 */
> +		wbinvd_on_all_cpus();
> +		/*
> +		 * According to the TDX hardware spec, if the platform
> +		 * doesn't have the "partial write machine check"
> +		 * erratum, any kernel read/write will never cause #MC
> +		 * in kernel space, thus it's OK to not convert PAMTs
> +		 * back to normal.  But do the conversion anyway here
> +		 * as suggested by the TDX spec.
> +		 */
> +		tdmrs_reset_pamt_all(&tdmr_list);
> +	}
>  out_free_pamts:
>  	if (ret)
>  		tdmrs_free_pamt_all(&tdmr_list);
> @@ -1019,6 +1147,9 @@ static int __tdx_enable(void)
>   * lock to prevent any new cpu from becoming online; 2) done both VMXON
>   * and tdx_cpu_enable() on all online cpus.
>   *
> + * This function requires there's at least one online cpu for each CPU
> + * package to succeed.
> + *
>   * This function can be called in parallel by multiple callers.
>   *
>   * Return 0 if TDX is enabled successfully, otherwise error.
> diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
> index c386aa3afe2a..a0438513bec0 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.h
> +++ b/arch/x86/virt/vmx/tdx/tdx.h
> @@ -21,6 +21,7 @@
>  /*
>   * TDX module SEAMCALL leaf functions
>   */
> +#define TDH_SYS_KEY_CONFIG	31
>  #define TDH_SYS_INFO		32
>  #define TDH_SYS_INIT		33
>  #define TDH_SYS_LP_INIT		35
> --
> 2.40.1
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ