lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 19 Oct 2015 12:29:14 -0600
From:	Ross Zwisler <ross.zwisler@...ux.intel.com>
To:	Chris Wilson <chris@...is-wilson.co.uk>
Cc:	linux-kernel@...r.kernel.org,
	Ross Zwisler <ross.zwisler@...ux.intel.com>,
	"H . Peter Anvin" <hpa@...ux.intel.com>,
	Imre Deak <imre.deak@...el.com>,
	Daniel Vetter <daniel.vetter@...ll.ch>,
	dri-devel@...ts.freedesktop.org
Subject: Re: [PATCH] x86: Add an explicit barrier() to clflushopt()

On Mon, Oct 19, 2015 at 10:58:55AM +0100, Chris Wilson wrote:
> During testing we observed that the last cacheline was not being flushed
> from a
> 
> 	mb()
> 	for (addr = addr & -clflush_size; addr < end; addr += clflush_size)
> 		clflushopt();
> 	mb()
> 
> loop (where the initial addr and end were not cacheline aligned).
> 
> Changing the loop from addr < end to addr <= end, or replacing the
> clflushopt() with clflush() both fixed the testcase. Hinting that GCC
> was miscompling the assembly within the loop and specifically the
> alternative within clflushopt() was confusing the loop optimizer.
> 
> Adding a barrier() into clflushopt() is enough for GCC to dtrt, but
> solving why GCC is not seeing the constraints from the alternative_io()
> would be smarter...
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92501
> Testcase: gem_tiled_partial_pwrite_pread/read
> Signed-off-by: Chris Wilson <chris@...is-wilson.co.uk>
> Cc: Ross Zwisler <ross.zwisler@...ux.intel.com>
> Cc: H. Peter Anvin <hpa@...ux.intel.com>
> Cc: Imre Deak <imre.deak@...el.com>
> Cc: Daniel Vetter <daniel.vetter@...ll.ch>
> Cc: dri-devel@...ts.freedesktop.org
> ---
>  arch/x86/include/asm/special_insns.h | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
> index 2270e41b32fd..0c7aedbf8930 100644
> --- a/arch/x86/include/asm/special_insns.h
> +++ b/arch/x86/include/asm/special_insns.h
> @@ -199,6 +199,11 @@ static inline void clflushopt(volatile void *__p)
>  		       ".byte 0x66; clflush %P0",
>  		       X86_FEATURE_CLFLUSHOPT,
>  		       "+m" (*(volatile char __force *)__p));
> +	/* GCC (4.9.1 and 5.2.1 at least) appears to be very confused when
> +	 * meeting this alternative() and demonstrably miscompiles loops
> +	 * iterating over clflushopts.
> +	 */
> +	barrier();
>  }
>  
>  static inline void clwb(volatile void *__p)
> -- 
> 2.6.1

I'm having trouble recreating this - can you help me get a simple reproducer?

I've been using the patch at the end of this mail as a test case.  

If the issue you are seeing is indeed a compiler error, the issue should be
visible in the resulting assembly.  I've dumped the assembly using objdump for
new loop in pmem_init() using both clflush() and clflushopt() in the loop, and
the loops are exactly the same except the clflushopt case has an extra byte as
part of the clflush instruction which is our NOOP prefix - we need this so
that the instruction is the right length so we can swap clflushopt in.

clflush() in loop:
	  3a:	0f ae 3a             	clflush (%rdx)

clflushopt() in loop:
	  3a:	3e 0f ae 3a          	clflush %ds:(%rdx)

We also get an extra entry in the <.altinstr_replacement> section in the
clflushopt case, as expected:

  15:	66                   	data16
  16:	0f ae 3a             	clflush (%rdx)

This all looks correct to me.  I've tested with both GCC 4.9.2 and GCC 5.1.1,
and they behave the same.

I also tried inserting a barrier() in the clflushopt() inline function, and it
didn't change the resulting assembly for me.

What am I missing?

- Ross

-- 8< --
>From 3e8c9cf1205c4505dcc29e09700c2990a3786664 Mon Sep 17 00:00:00 2001
From: Ross Zwisler <ross.zwisler@...ux.intel.com>
Date: Mon, 19 Oct 2015 11:59:02 -0600
Subject: [PATCH] TESTING: understand clflushopt() compilation error

---
 drivers/nvdimm/pmem.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 0ba6a97..3f735a6 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -451,6 +451,20 @@ static int __init pmem_init(void)
 {
 	int error;
 
+	unsigned long clflush_size = boot_cpu_data.x86_clflush_size;
+	const unsigned int size = PAGE_SIZE;
+	void *addr = kmalloc(size, GFP_KERNEL);
+	void *end = addr + size;
+	void *p;
+
+	mb();
+	for (p = (void *)((unsigned long)addr & -clflush_size); p < end;
+			p += clflush_size)
+		clflush(p);
+	mb();
+
+	kfree(addr);
+
 	pmem_major = register_blkdev(0, "pmem");
 	if (pmem_major < 0)
 		return pmem_major;
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ