lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 7 Jan 2016 14:29:58 -0800
From:	"H. Peter Anvin" <hpa@...or.com>
To:	Chris Wilson <chris@...is-wilson.co.uk>,
	Andy Lutomirski <luto@...capital.net>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Ross Zwisler <ross.zwisler@...ux.intel.com>,
	"H . Peter Anvin" <hpa@...ux.intel.com>,
	Borislav Petkov <bp@...en8.de>,
	Brian Gerst <brgerst@...il.com>,
	Denys Vlasenko <dvlasenk@...hat.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Imre Deak <imre.deak@...el.com>,
	Daniel Vetter <daniel.vetter@...ll.ch>,
	DRI <dri-devel@...ts.freedesktop.org>
Subject: Re: [PATCH] x86: Add an explicit barrier() to clflushopt()

On 01/07/16 13:54, Chris Wilson wrote:
> 
> Whilst you are looking at this asm, note that we reload
> boot_cpu_data.x86_cflush_size everytime around the loop. That's a small
> but noticeable extra cost (especially when we are only flushing a single
> cacheline).
> 

I did notice that; I don't know if this is a compiler failure to do an
obvious hoisting (which should then be merged with the other load) or a
consequence of the volatile.  It might be useful to report this to the
gcc bugzilla to let them look at it.

Either way, the diff looks good and you can add my:

Acked-by: H. Peter Anvin <hpa@...ux.intel.com>

However, I see absolutely nothing wrong with the assembly other than
minor optimization possibilities: since gcc generates an early-out test
as well as a late-out test anyway, we could add an explicit:

	if (p < vend)
		return;

before the first mb() at no additional cost (assuming gcc is smart
enough to skip the second test; otherwise that can easily be done
manually by replacing the for loop with a do { } while loop.

I would be very interested in knowing if replacing the final clflushopt
with a clflush would resolve your problems (in which case the last mb()
shouldn't be necessary either.)

Something like:

void clflush_cache_range(void *vaddr, unsigned int size)
{
	unsigned long clflush_size = boot_cpu_data.x86_clflush_size;
	char *vend = (char *)vaddr + size;
	char *p = (char *)((unsigned long)vaddr & ~(clflush_size - 1);

	if (p >= vend)
		return;

	mb();

	for (; p < vend - clflush_size; p += clflush_size)
		clflushopt(p);

	clflush(p);		/* Serializing and thus a barrier */
}

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ