lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250628125001.GDaF_k-e2KTo4QlKjl@fat_crate.local>
Date: Sat, 28 Jun 2025 14:50:01 +0200
From: Borislav Petkov <bp@...en8.de>
To: Kai Huang <kai.huang@...el.com>
Cc: dave.hansen@...el.com, tglx@...utronix.de, peterz@...radead.org,
	mingo@...hat.com, hpa@...or.com, thomas.lendacky@....com,
	x86@...nel.org, kirill.shutemov@...ux.intel.com,
	rick.p.edgecombe@...el.com, linux-kernel@...r.kernel.org,
	pbonzini@...hat.com, seanjc@...gle.com, kvm@...r.kernel.org,
	reinette.chatre@...el.com, isaku.yamahata@...el.com,
	dan.j.williams@...el.com, ashish.kalra@....com,
	nik.borisov@...e.com, sagis@...gle.com
Subject: Re: [PATCH v3 1/6] x86/sme: Use percpu boolean to control wbinvd
 during kexec

On Thu, Jun 26, 2025 at 10:48:47PM +1200, Kai Huang wrote:

...

> Doing WBINVD in stop_this_cpu() could potentially increase the chance to
> trigger the above "race" despite it's still rare to happen.

Oh the amount of text... 

Please run it and all your comments through AI to simplify formulations etc.
It is a lot to read.

> Signed-off-by: Kai Huang <kai.huang@...el.com>
> ---
>  arch/x86/include/asm/kexec.h         |  2 +-
>  arch/x86/include/asm/processor.h     |  2 ++
>  arch/x86/kernel/cpu/amd.c            | 16 ++++++++++++++++
>  arch/x86/kernel/machine_kexec_64.c   | 15 ++++++++++-----
>  arch/x86/kernel/process.c            | 16 +++-------------
>  arch/x86/kernel/relocate_kernel_64.S | 15 +++++++++++----
>  6 files changed, 43 insertions(+), 23 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
> index f2ad77929d6e..d7e93522b93d 100644
> --- a/arch/x86/include/asm/kexec.h
> +++ b/arch/x86/include/asm/kexec.h
> @@ -122,7 +122,7 @@ relocate_kernel_fn(unsigned long indirection_page,
>  		   unsigned long pa_control_page,
>  		   unsigned long start_address,
>  		   unsigned int preserve_context,
> -		   unsigned int host_mem_enc_active);
> +		   unsigned int cache_incoherent);

So preserve_context and cache_incoherent are both a *single* bit of
information. And we use two u32s for that?!?!

How about flags please?

>  #endif
>  extern relocate_kernel_fn relocate_kernel;
>  #define ARCH_HAS_KIMAGE_ARCH
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index bde58f6510ac..a24c7805acdb 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -731,6 +731,8 @@ void __noreturn stop_this_cpu(void *dummy);
>  void microcode_check(struct cpuinfo_x86 *prev_info);
>  void store_cpu_caps(struct cpuinfo_x86 *info);
>  

So much text above - not a single comment here explaining what this var is
for.

> +DECLARE_PER_CPU(bool, cache_state_incoherent);
> +
>  enum l1tf_mitigations {
>  	L1TF_MITIGATION_OFF,
>  	L1TF_MITIGATION_AUTO,
> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> index f18f540db58c..4c7fde344216 100644
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -503,6 +503,22 @@ static void early_detect_mem_encrypt(struct cpuinfo_x86 *c)
>  {
>  	u64 msr;
>  
> +	/*
> +	 * Mark using wbinvd is needed during kexec on processors that

For all text: write insns in caps pls - WBINVD.

> +	 * support SME. This provides support for performing a successful
> +	 * kexec when going from SME inactive to SME active (or vice-versa).
> +	 *
> +	 * The cache must be cleared so that if there are entries with the
> +	 * same physical address, both with and without the encryption bit,
> +	 * they don't race each other when flushed and potentially end up
> +	 * with the wrong entry being committed to memory.
> +	 *
> +	 * Test the CPUID bit directly because the machine might've cleared
> +	 * X86_FEATURE_SME due to cmdline options.

Where?

That same function does the clearing later...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ