lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250102124247.GPZ3aJx8JTJa6PcaOW@fat_crate.local>
Date: Thu, 2 Jan 2025 13:42:47 +0100
From: Borislav Petkov <bp@...en8.de>
To: Rik van Riel <riel@...riel.com>
Cc: x86@...nel.org, linux-kernel@...r.kernel.org, kernel-team@...a.com,
	dave.hansen@...ux.intel.com, luto@...nel.org, peterz@...radead.org,
	tglx@...utronix.de, mingo@...hat.com, hpa@...or.com,
	akpm@...ux-foundation.org, nadav.amit@...il.com,
	zhengqi.arch@...edance.com, linux-mm@...ck.org
Subject: Re: [PATCH 05/12] x86/mm: add INVLPGB support code

On Mon, Dec 30, 2024 at 12:53:06PM -0500, Rik van Riel wrote:
> Add invlpgb.h with the helper functions and definitions needed to use
> broadcast TLB invalidation on AMD EPYC 3 and newer CPUs.
> 
> Signed-off-by: Rik van Riel <riel@...riel.com>
> ---
>  arch/x86/include/asm/invlpgb.h  | 93 +++++++++++++++++++++++++++++++++
>  arch/x86/include/asm/tlbflush.h |  1 +
>  2 files changed, 94 insertions(+)
>  create mode 100644 arch/x86/include/asm/invlpgb.h
> 
> diff --git a/arch/x86/include/asm/invlpgb.h b/arch/x86/include/asm/invlpgb.h
> new file mode 100644
> index 000000000000..862775897a54
> --- /dev/null
> +++ b/arch/x86/include/asm/invlpgb.h

I don't see the point for a separate header just for that. We have
arch/x86/include/asm/tlb.h.

> @@ -0,0 +1,93 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_X86_INVLPGB
> +#define _ASM_X86_INVLPGB
> +
> +#include <vdso/bits.h>
> +
> +/*
> + * INVLPGB does broadcast TLB invalidation across all the CPUs in the system.
> + *
> + * The INVLPGB instruction is weakly ordered, and a batch of invalidations can
> + * be done in a parallel fashion.
> + *
> + * TLBSYNC is used to ensure that pending INVLPGB invalidations initiated from
> + * this CPU have completed.
> + */
> +static inline void __invlpgb(unsigned long asid, unsigned long pcid, unsigned long addr,
> +			    int extra_count, bool pmd_stride, unsigned long flags)

See below. Once you prune the functions you're not using in your patchset,
this argument list will drop too. We can always extend it later, if really
needed, so let's keep it simple here.

I had slimmed it down to this internally:

static inline void invlpgb(unsigned long va, unsigned long count, unsigned long id)
{
       /* INVLPGB; supported in binutils >= 2.36. */
	asm volatile(".byte 0x0f, 0x01, 0xfe"
		     : : "a" (va), "c" (count), "d" (id)
		     : "memory");
}

I had the memory clobber too but now that I think of it, it probably isn't
needed because even if the compiler reorders INVLPGB, it is weakly-ordered
anyway.

TLBSYNC should probably have a memory clobber tho, to prevent the compiler
from doing funky stuff...

> +{
> +	u64 rax = addr | flags;
> +	u32 ecx = (pmd_stride << 31) | extra_count;
> +	u32 edx = (pcid << 16) | asid;
> +
> +	asm volatile("invlpgb" : : "a" (rax), "c" (ecx), "d" (edx));

No, you do:

	/* INVLPGB; supported in binutils >= 2.36. */
 	asm volatile(".byte 0x0f, 0x01, 0xfe"
	...


> +/*
> + * INVLPGB can be targeted by virtual address, PCID, ASID, or any combination
> + * of the three. For example:
> + * - INVLPGB_VA | INVLPGB_INCLUDE_GLOBAL: invalidate all TLB entries at the address
> + * - INVLPGB_PCID:              	  invalidate all TLB entries matching the PCID
		     ^^^^^^^^^^^^^^^^^^^^^^

Whitespace damage here. Needs tabs.

> + *
> + * The first can be used to invalidate (kernel) mappings at a particular
> + * address across all processes.
> + *
> + * The latter invalidates all TLB entries matching a PCID.
> + */
> +#define INVLPGB_VA			BIT(0)
> +#define INVLPGB_PCID			BIT(1)
> +#define INVLPGB_ASID			BIT(2)
> +#define INVLPGB_INCLUDE_GLOBAL		BIT(3)
> +#define INVLPGB_FINAL_ONLY		BIT(4)
> +#define INVLPGB_INCLUDE_NESTED		BIT(5)

Please add only the defines which are actually being used. Ditto for the
functions.

> +/* Wait for INVLPGB originated by this CPU to complete. */
> +static inline void tlbsync(void)
> +{
> +	asm volatile("tlbsync");
> +}

	/* TLBSYNC; supported in binutils >= 2.36. */
 	asm volatile(".byte 0x0f, 0x01, 0xff" ::: "memory");

> +
> +#endif /* _ASM_X86_INVLPGB */
> diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
> index 7d1468a3967b..20074f17fbcd 100644
> --- a/arch/x86/include/asm/tlbflush.h
> +++ b/arch/x86/include/asm/tlbflush.h
> @@ -10,6 +10,7 @@
>  #include <asm/cpufeature.h>
>  #include <asm/special_insns.h>
>  #include <asm/smp.h>
> +#include <asm/invlpgb.h>
>  #include <asm/invpcid.h>
>  #include <asm/pti.h>
>  #include <asm/processor-flags.h>
> -- 
> 2.47.1
> 

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ