Date:	Fri, 8 Jan 2016 09:58:26 -0700
From:	Ross Zwisler <ross.zwisler@...ux.intel.com>
To:	Chris Wilson <chris@...is-wilson.co.uk>
Cc:	x86@...nel.org, Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>, Toshi Kani <toshi.kani@....com>,
	Borislav Petkov <bp@...e.de>,
	"Luis R. Rodriguez" <mcgrof@...e.com>,
	Stephen Rothwell <sfr@...b.auug.org.au>,
	Ross Zwisler <ross.zwisler@...ux.intel.com>,
	Sai Praneeth <sai.praneeth.prakhya@...el.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] x86: Micro-optimise clflush_cache_range()

On Fri, Jan 08, 2016 at 09:55:33AM +0000, Chris Wilson wrote:
> Whilst inspecting the asm for clflush_cache_range() and some perf profiles
> that required extensive flushing of single cachelines (from part of the
> intel-gpu-tools GPU benchmarks), we noticed that gcc was reloading
> boot_cpu_data.x86_clflush_size on every iteration of the loop. We can
> manually hoist that read which perf regarded as taking ~25% of the
> function time for a single cacheline flush.
> 
> Signed-off-by: Chris Wilson <chris@...is-wilson.co.uk>
> Cc: Thomas Gleixner <tglx@...utronix.de>
> Cc: Ingo Molnar <mingo@...hat.com>
> Cc: "H. Peter Anvin" <hpa@...or.com>
> Cc: x86@...nel.org
> Cc: Toshi Kani <toshi.kani@....com>
> Cc: Borislav Petkov <bp@...e.de>
> Cc: "Luis R. Rodriguez" <mcgrof@...e.com>
> Cc: Stephen Rothwell <sfr@...b.auug.org.au>
> Cc: Ross Zwisler <ross.zwisler@...ux.intel.com>
> Cc: Sai Praneeth <sai.praneeth.prakhya@...el.com>
> Cc: linux-kernel@...r.kernel.org
> Acked-by: "H. Peter Anvin" <hpa@...or.com>

Looks good to me.

Reviewed-by: Ross Zwisler <ross.zwisler@...ux.intel.com>

> ---
>  arch/x86/mm/pageattr.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
> index a3137a4feed1..6000ad7f560c 100644
> --- a/arch/x86/mm/pageattr.c
> +++ b/arch/x86/mm/pageattr.c
> @@ -129,14 +129,16 @@ within(unsigned long addr, unsigned long start, unsigned long end)
>   */
>  void clflush_cache_range(void *vaddr, unsigned int size)
>  {
> -	unsigned long clflush_mask = boot_cpu_data.x86_clflush_size - 1;
> +	const unsigned long clflush_size = boot_cpu_data.x86_clflush_size;
> +	void *p = (void *)((unsigned long)vaddr & ~(clflush_size - 1));
>  	void *vend = vaddr + size;
> -	void *p;
> +
> +	if (p >= vend)
> +		return;
>  
>  	mb();
>  
> -	for (p = (void *)((unsigned long)vaddr & ~clflush_mask);
> -	     p < vend; p += boot_cpu_data.x86_clflush_size)
> +	for (; p < vend; p += clflush_size)
>  		clflushopt(p);
>  
>  	mb();
> -- 
> 2.7.0.rc3
> 
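
For readers following the change outside the kernel tree, here is a minimal
userspace sketch of the hoisting pattern the patch applies. It is an
illustration under stated assumptions, not the kernel code: cpu_sketch,
flush_line_sketch() and flush_range_sketch() are hypothetical stand-ins for
boot_cpu_data, clflushopt() and clflush_cache_range(), the flush is replaced
by a counter, and the mb() barriers are omitted. The line size is read once
into a const local, presumably so the compiler does not need to assume the
field may have changed after each flush.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for boot_cpu_data; not the kernel structure. */
struct cpu_info_sketch {
	unsigned int clflush_size;	/* cache line size in bytes, power of two */
};

static struct cpu_info_sketch cpu_sketch = { .clflush_size = 64 };

/* Number of lines "flushed" by the most recent flush_range_sketch() call. */
static unsigned long flush_count;

/* Stand-in for clflushopt(): only counts calls, flushes nothing. */
static void flush_line_sketch(uintptr_t p)
{
	(void)p;
	flush_count++;
}

/*
 * Same loop shape as the patched function: read the line size once,
 * round the start address down to a line boundary, then step one line
 * at a time until the end of the range. Returns the number of lines
 * visited. The real function also issues mb() before and after the loop.
 */
static unsigned long flush_range_sketch(uintptr_t vaddr, unsigned int size)
{
	const uintptr_t line = cpu_sketch.clflush_size;	/* hoisted read */
	uintptr_t p = vaddr & ~(line - 1);
	const uintptr_t vend = vaddr + size;

	flush_count = 0;

	if (p >= vend)
		return 0;

	for (; p < vend; p += line)
		flush_line_sketch(p);

	return flush_count;
}
```

With the assumed 64-byte line size, a 64-byte range starting on a line
boundary visits one line, and the same range starting 16 bytes into a line
visits two, because the start address is rounded down before the loop.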
