Message-Id: <1211378410.8297.192.camel@pasglop>
Date:	Wed, 21 May 2008 10:00:10 -0400
From:	Benjamin Herrenschmidt <benh@...nel.crashing.org>
To:	Trent Piepho <tpiepho@...escale.com>
Cc:	linuxppc-dev@...abs.org, linux-kernel@...r.kernel.org,
	Scott Wood <scottwood@...escale.com>
Subject: Re: [PATCH] [POWERPC] Improve (in|out)_beXX() asm code


> Depends on what you define as "necessary".  It seems clear that I/O accessors
> _do not_ need to be strictly ordered with respect to normal memory accesses,
> by what's defined in memory-barriers.txt.  So if by "necessary" you mean what
> the Linux standard for I/O accessors requires (and what other archs provide),
> then yes, they have the necessary ordering guarantees.
> 
> But, if you want them to be strictly ordered w.r.t. normal memory, that's
> not the case.

They should be.

> For example, in something like:
> 
> u32 *dmabuf = kmalloc(...);
> ...
> dmabuf[0] = 1;
> out_be32(&regs->dmactl, DMA_SEND_BUFFER);
> dmabuf[0] = 2;
> out_be32(&regs->dmactl, DMA_SEND_BUFFER);
> 
> gcc might decide to optimize this code to:
> 
> out_be32(&regs->dmactl, DMA_SEND_BUFFER);
> out_be32(&regs->dmactl, DMA_SEND_BUFFER);
> dmabuf[0] = 2;

If that's the case, there is a bug. Leaving possible gcc optimisations
aside, the accessors contain the necessary memory barriers for things to
work the way you describe above. If the use of volatile and the memory
clobber in our macros isn't enough to also prevent such optimisations,
then we have a bug and you are welcome to provide a patch to fix it.
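
For illustration, here is a minimal sketch of the volatile-plus-clobber
pattern referred to above. It is not the actual powerpc out_be32()
macro: the names sketch_out32, fake_reg, fake_dmabuf and
compiler_barrier are hypothetical, the "register" is an ordinary
variable, and the hardware barrier instructions a real accessor would
need are left out so that the code builds on any gcc target. It only
shows how a "memory" clobber is meant to keep the compiler from moving
normal stores across the I/O store.

```c
#include <stdint.h>

/* Hypothetical stand-in for a device register. Real driver code would
 * use an ioremap()ed address, not a plain variable. */
static volatile uint32_t fake_reg;

/* Ordinary memory standing in for a DMA buffer. */
static uint32_t fake_dmabuf[4];

/* Compiler-only barrier (gcc extended asm). The "memory" clobber tells
 * gcc that the asm may read or write any memory, so it may not move
 * memory accesses across this point or keep memory values cached in
 * registers across it. It emits no instruction and does not order
 * anything at the hardware level. */
#define compiler_barrier() __asm__ __volatile__("" : : : "memory")

/* Simplified accessor: a volatile store bracketed by compiler barriers.
 * Assumption: a real accessor would also issue the CPU barrier
 * instructions its architecture requires. */
static inline void sketch_out32(volatile uint32_t *addr, uint32_t val)
{
    compiler_barrier();
    *addr = val;
    compiler_barrier();
}

/* Same shape as the quoted example: each buffer store must be emitted
 * before the register write that follows it. */
uint32_t sketch_send_twice(void)
{
    fake_dmabuf[0] = 1;
    sketch_out32(&fake_reg, 0x10);
    fake_dmabuf[0] = 2;
    sketch_out32(&fake_reg, 0x20);
    return fake_dmabuf[0];
}
```

Compiling this with -O2 -save-temps and reading the generated asm is one
way to check that both stores to fake_dmabuf are present and that each
one precedes the corresponding store to fake_reg.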

> gcc will often not do this optimization, because there might be aliasing
> between "&regs->dmactl" and "dmabuf", but it _can_ do it.  gcc can't optimize
> the two identical out_be32's into one, or re-order them if they were to
> different registers, but it can move the normal memory accesses around them.

The Linux kernel -cannot- be compiled with strict aliasing rules. This
is one of the many areas where those are violated. Frankly, this strict
aliasing stuff is just a total nightmare turning a perfectly nice and
usable language into something it's not meant to be.

> Here's a quick hack I stuck in a driver to test.  Compile with -save-temps and
> check the resulting asm.  gcc will do the optimization I described above.
> 
> static void __iomem *baz = (void*)0x1234;
> static struct bar {
>      u32 bar[256];
> } bar;
> 
> void foo(void) {
>      bar.bar[0] = 44;
>      out_be32(baz+100, 200);
>      bar.bar[0] = 45;
>      out_be32(baz+101, 201);
> }

Have you removed -fno-strict-aliasing? Just don't do that.

Ben.

