lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 11 Jun 2008 14:06:27 +1000
From:	Nick Piggin <nickpiggin@...oo.com.au>
To:	benh@...nel.crashing.org
Cc:	Jesse Barnes <jbarnes@...tuousgeek.org>,
	linux-arch@...r.kernel.org, Roland Dreier <rdreier@...co.com>,
	James Bottomley <James.Bottomley@...senpartnership.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Matthew Wilcox <matthew@....cx>,
	Trent Piepho <tpiepho@...escale.com>,
	Russell King <rmk+lkml@....linux.org.uk>,
	David Miller <davem@...emloft.net>, scottwood@...escale.com,
	linuxppc-dev@...abs.org, alan@...rguk.ukuu.org.uk,
	linux-kernel@...r.kernel.org
Subject: Re: MMIO and gcc re-ordering issue

On Wednesday 11 June 2008 13:40, Benjamin Herrenschmidt wrote:
> On Wed, 2008-06-11 at 13:29 +1000, Nick Piggin wrote:
> > Exactly, yes. I guess everybody has had good intentions here, but
> > as noticed, what is lacking is coordination and documentation.
> >
> > You mention strong ordering WRT spin_unlock, which suggests that
> > you would prefer to take option #2 (the current powerpc one): io/io
> > is ordered and io is contained inside spinlocks, but io/cacheable
> > in general is not ordered.
>
> IO/cacheable -is- ordered on powepc in what we believe is the direction
> that matter: IO reads are fully ordered vs. anything and IO writes are
> ordered vs. previous cacheable stores. The only "relaxed" situation is
> IO writes followed by cacheable stores, which I believe shouldn't be
> a problem. (except for spinlocks for which we use the flag trick)

Spinlocks... mutexes, semaphores, rwsems, rwlocks, bit spinlocks, bit
mutexes, open coded bit locks (of which there are a number floating
around in drivers/).

But even assuming you get all of that fixed up. I wonder what is such
a big benefit to powerpc that you'll rather add the exception "cacheable
stores are not ordered with previous io stores" than to say "any driver
which works on x86 will work on powerpc as far as memory ordering goes"?
(don't you also need something to order io reads with cacheable reads?
as per my observation that your rmb is broken according to IBM docs)

Obviously you already have a sync instruction in your writel, so 1)
adding a second one doesn't slow it down by an order of mangnitude or
anything, just some small factor; and 2) you obviously want to be
converting high performance drivers to a more relaxed model anyway
regardless of whether there is one sync or two in there.

Has this ever been measured or thought carefully about? Beyond the
extent of "one sync good, two sync bad" ;)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ