linux-kernel - Re: MMIO and gcc re-ordering issue


Message-Id: <200806111406.28411.nickpiggin@yahoo.com.au>
Date:	Wed, 11 Jun 2008 14:06:27 +1000
From:	Nick Piggin <nickpiggin@...oo.com.au>
To:	benh@...nel.crashing.org
Cc:	Jesse Barnes <jbarnes@...tuousgeek.org>,
	linux-arch@...r.kernel.org, Roland Dreier <rdreier@...co.com>,
	James Bottomley <James.Bottomley@...senpartnership.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Matthew Wilcox <matthew@....cx>,
	Trent Piepho <tpiepho@...escale.com>,
	Russell King <rmk+lkml@....linux.org.uk>,
	David Miller <davem@...emloft.net>, scottwood@...escale.com,
	linuxppc-dev@...abs.org, alan@...rguk.ukuu.org.uk,
	linux-kernel@...r.kernel.org
Subject: Re: MMIO and gcc re-ordering issue

On Wednesday 11 June 2008 13:40, Benjamin Herrenschmidt wrote:
> On Wed, 2008-06-11 at 13:29 +1000, Nick Piggin wrote:
> > Exactly, yes. I guess everybody has had good intentions here, but
> > as noticed, what is lacking is coordination and documentation.
> >
> > You mention strong ordering WRT spin_unlock, which suggests that
> > you would prefer to take option #2 (the current powerpc one): io/io
> > is ordered and io is contained inside spinlocks, but io/cacheable
> > in general is not ordered.
>
> IO/cacheable -is- ordered on powerpc in what we believe is the direction
> that matters: IO reads are fully ordered vs. anything and IO writes are
> ordered vs. previous cacheable stores. The only "relaxed" situation is
> IO writes followed by cacheable stores, which I believe shouldn't be
> a problem. (except for spinlocks for which we use the flag trick)

Spinlocks... mutexes, semaphores, rwsems, rwlocks, bit spinlocks, bit
mutexes, open coded bit locks (of which there are a number floating
around in drivers/).
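
To make the concern concrete, here is a rough user-space model of the
"flag trick" as I understand it: writel marks that this CPU has done
MMIO, and the unlock path issues a sync if the mark is set. Every name
below (model_writel, model_spin_unlock, io_sync_pending and so on) is
made up for illustration and is not the actual powerpc code. The point
is only that any release path which skips the flag check lets the lock
release follow the MMIO store with no sync in between.

```c
#include <stdbool.h>
#include <stddef.h>

/* User-space model only: every name here is a hypothetical stand-in. */
enum event { EV_SYNC, EV_MMIO_STORE, EV_LOCK_RELEASE };

#define TRACE_MAX 32
enum event trace[TRACE_MAX];
size_t trace_len;
bool io_sync_pending;   /* models a per-CPU "MMIO was done" flag */

void record(enum event e)
{
    if (trace_len < TRACE_MAX)
        trace[trace_len++] = e;
}

void reset_model(void)
{
    trace_len = 0;
    io_sync_pending = false;
}

/* writel: sync, then the MMIO store, then remember that MMIO happened. */
void model_writel(void)
{
    record(EV_SYNC);        /* orders earlier cacheable stores first */
    record(EV_MMIO_STORE);
    io_sync_pending = true;
}

/* A lock primitive that knows about the flag. */
void model_spin_unlock(void)
{
    if (io_sync_pending) {
        record(EV_SYNC);
        io_sync_pending = false;
    }
    record(EV_LOCK_RELEASE);
}

/* An open-coded bit lock: clears the bit and never looks at the flag. */
void model_open_coded_bit_unlock(unsigned long *word, unsigned bit)
{
    *word &= ~(1UL << bit);
    record(EV_LOCK_RELEASE);
}

/* True if no lock release follows an MMIO store without a sync between. */
bool mmio_contained_in_locks(void)
{
    bool unsynced_mmio = false;

    for (size_t i = 0; i < trace_len; i++) {
        switch (trace[i]) {
        case EV_MMIO_STORE:
            unsynced_mmio = true;
            break;
        case EV_SYNC:
            unsynced_mmio = false;
            break;
        case EV_LOCK_RELEASE:
            if (unsynced_mmio)
                return false;
            break;
        }
    }
    return true;
}
```

In this model, model_writel followed by model_spin_unlock passes the
check, while model_writel followed by model_open_coded_bit_unlock does
not. Each of the lock types listed above would need the same treatment
as the spinlock path.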

But even assuming you get all of that fixed up, I wonder what the big
benefit to powerpc is, such that you'd rather add the exception "cacheable
stores are not ordered with previous io stores" than say "any driver
which works on x86 will work on powerpc as far as memory ordering goes"?
(Don't you also need something to order io reads with cacheable reads,
as per my observation that your rmb is broken according to IBM docs?)

Obviously you already have a sync instruction in your writel, so 1)
adding a second one doesn't slow it down by an order of magnitude or
anything, just some small factor; and 2) you obviously want to be
converting high performance drivers to a more relaxed model anyway
regardless of whether there is one sync or two in there.
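
As a sketch of the three shapes being compared, with a C11 full fence
standing in for the powerpc sync instruction and ordinary memory
standing in for a device register: the function names are hypothetical,
and a C11 fence makes no promise about real MMIO ordering, so this only
shows where the barriers would sit, not what they cost.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Stand-in for "sync": a full fence in portable C11 (an assumption). */
static inline void full_barrier(void)
{
    atomic_thread_fence(memory_order_seq_cst);
}

/* "One sync": earlier cacheable stores are ordered before the io store. */
void writel_one_sync(uint32_t val, volatile uint32_t *reg)
{
    full_barrier();
    *reg = val;
}

/* "Two syncs": later cacheable stores are also ordered after the io store. */
void writel_two_syncs(uint32_t val, volatile uint32_t *reg)
{
    full_barrier();
    *reg = val;
    full_barrier();
}

/* Relaxed form for fast paths: the caller places barriers explicitly. */
void writel_relaxed_model(uint32_t val, volatile uint32_t *reg)
{
    *reg = val;
}
```

A high performance driver would batch several writel_relaxed_model
calls and issue one barrier itself, which is why the cost of the
default writel matters less for such drivers.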

Has this ever been measured or thought about carefully? Beyond the
extent of "one sync good, two sync bad" ;)