Message-ID: <df0750e0-dd6f-7418-53bd-64a9ad1e0086@citrix.com>
Date:   Fri, 14 Jul 2023 21:27:24 +0100
From:   andrew.cooper3@...rix.com
To:     Dave Hansen <dave.hansen@...ux.intel.com>, dave.hansen@...el.com
Cc:     linux-kernel@...r.kernel.org, jannh@...gle.com, x86@...nel.org,
        luto@...nel.org, peterz@...radead.org
Subject: Re: [PATCH] x86/mm: Remove "INVPCID single" feature tracking

On 14/07/2023 7:35 pm, Dave Hansen wrote:
> From: Dave Hansen <dave.hansen@...ux.intel.com>
>
> tl;dr: Replace a synthetic X86_FEATURE with a hardware X86_FEATURE
>        and check of existing per-cpu state.
>
> == Background ==
>
> There are three features in play here:
>  1. Good old Page Table Isolation (PTI)
>  2. Process Context IDentifiers (PCIDs) which allow entries from
>     multiple address spaces to be in the TLB at once.
>  3. Support for the "Invalidate PCID" (INVPCID) instruction,
>     specifically the "individual address" mode (aka. mode 0).
>
> When all *three* of these are in place, INVPCID can and should be used
> to flush out individual addresses in the PTI user address space.
>
> But there's a wrinkle or two: First, this INVPCID mode is dependent on
> CR4.PCIDE.  Even if X86_FEATURE_INVPCID==1, the instruction may #GP
> without setting up CR4.

Can the SDM authors go and reconsider their position of (not) including
this condition in the exception list?

Or give up and just point intel.com/sdm at AMD, because AMD do describe
this coherently.

> diff -puN arch/x86/mm/tlb.c~remove-invpcid-single arch/x86/mm/tlb.c
> --- a/arch/x86/mm/tlb.c~remove-invpcid-single	2023-07-14 08:29:08.665225945 -0700
> +++ b/arch/x86/mm/tlb.c	2023-07-14 08:29:08.673225955 -0700
> @@ -1141,20 +1141,24 @@ void flush_tlb_one_kernel(unsigned long
>  STATIC_NOPV void native_flush_tlb_one_user(unsigned long addr)
>  {
>  	u32 loaded_mm_asid = this_cpu_read(cpu_tlbstate.loaded_mm_asid);
> +	bool cpu_pcide     = this_cpu_read(cpu_tlbstate.cr4) & X86_CR4_PCIDE;
>  
> +	/* Flush 'addr' from the kernel PCID: */
>  	asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
>  
> +	/* If PTI is off there is no user PCID and nothing to flush. */
>  	if (!static_cpu_has(X86_FEATURE_PTI))
>  		return;

As a minor observation, the common case is for the function to exit
here, but you've got both this_cpu_read()s ahead of a full compiler
memory barrier (the "memory" clobber on the invlpg asm).

If you move them down below the PTI check, you'll drop the reads from
the common case; a quick sketch of what I mean is below.  But...
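
Purely to illustrate the reordering (untested), inside
native_flush_tlb_one_user():

        u32 loaded_mm_asid;
        bool cpu_pcide;

        /* Flush 'addr' from the kernel PCID: */
        asm volatile("invlpg (%0)" ::"r" (addr) : "memory");

        /* If PTI is off there is no user PCID and nothing to flush. */
        if (!static_cpu_has(X86_FEATURE_PTI))
                return;

        /* Only pay for the per-cpu reads on the PTI path: */
        loaded_mm_asid = this_cpu_read(cpu_tlbstate.loaded_mm_asid);
        cpu_pcide      = this_cpu_read(cpu_tlbstate.cr4) & X86_CR4_PCIDE;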

>  
>  	/*
> -	 * Some platforms #GP if we call invpcid(type=1/2) before CR4.PCIDE=1.
> -	 * Just use invalidate_user_asid() in case we are called early.
> +	 * invpcid_flush_one(pcid>0) will #GP if CR4.PCIDE==0.  Check
> +	 * 'cpu_pcide' to ensure that *this* CPU will not trigger those
> +	 * #GP's even if called before CR4.PCIDE has been initialized.
>  	 */
> -	if (!this_cpu_has(X86_FEATURE_INVPCID_SINGLE))
> -		invalidate_user_asid(loaded_mm_asid);
> -	else
> +	if (boot_cpu_has(X86_FEATURE_INVPCID) && cpu_pcide)

... why can't this just be && loaded_mm_asid?

There's no plausible way the asid can be nonzero here without CR4.PCIDE
being set, and that avoids looking at cr4 directly.
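
i.e., just to sketch the suggestion (assuming the true branch of the
hunk stays the usual user_pcid()/invpcid_flush_one() call):

        if (boot_cpu_has(X86_FEATURE_INVPCID) && loaded_mm_asid)
                invpcid_flush_one(user_pcid(loaded_mm_asid), addr);
        else
                /* asid 0 (or no INVPCID) takes the deferred-flush path. */
                invalidate_user_asid(loaded_mm_asid);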

~Andrew
