Message-ID: <alpine.LFD.2.00.0902130808050.3099@localhost.localdomain>
Date:	Fri, 13 Feb 2009 08:18:46 -0800 (PST)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Mathieu Desnoyers <compudj@...stal.dyndns.org>
cc:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Nick Piggin <nickpiggin@...oo.com.au>,
	Bryan Wu <cooloney@...nel.org>, linux-kernel@...r.kernel.org,
	ltt-dev@...ts.casi.polymtl.ca,
	uclinux-dist-devel@...ckfin.uclinux.org
Subject: Re: [ltt-dev] [RFC git tree] Userspace RCU (urcu) for Linux
 (repost)



On Fri, 13 Feb 2009, Mathieu Desnoyers wrote:
> 
> I created also
> 
> _STORE_SHARED()
> _LOAD_SHARED()
> 
> which identify the variables which need to have cache flush done before
> (load) or after (store). So we get both speed and identification when
> needed (if we need to do batch updates linked with a single cache flush).
> e.g.

The thing is, THAT JUST ABSOLUTELY SUCKS.

Lookie here - we don't want to flush the cache at every load of a shared 
variable. There's no reason to. If you don't care about the ordering, you 
might as well get the old values. That's what memory ordering _means_, for 
chrissake! In the absence of locks, loads may get stale values. It's that 
easy.

A lot of code wants to access multiple variables, and they are potentially 
nearby, and in the same cacheline. Making them all use _LOAD_SHARED() adds 
absolutely no value - and makes it MUCH MUCH SLOWER.

So what's the answer?

I already outlined it: either you use locks (which will do the magic for 
you), or you use memory barriers. In no case do you make the access magic, 
unless you have a compiler issue where you are afraid that the compiler 
would turn it into _multiple_ accesses and potentially get inconsistent 
results.
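
A minimal userspace sketch of the memory-barrier option, assuming GCC or Clang. The names my_wmb(), my_rmb(), payload, published, publish() and try_consume() are hypothetical and do not come from the thread; __sync_synchronize() is a full barrier used here as a stand-in for the kernel's weaker smp_wmb()/smp_rmb(). The accesses themselves stay plain, and the ordering comes only from the barriers:

```c
/* Hypothetical stand-ins for the kernel's smp_wmb()/smp_rmb().
 * A full barrier is stronger than needed, but keeps the sketch portable. */
#define my_wmb() __sync_synchronize()
#define my_rmb() __sync_synchronize()

static int payload;	/* data being handed over */
static int published;	/* set once payload is valid */

/* Writer: store the data, then the flag, with a barrier in between. */
static void publish(int value)
{
	payload = value;	/* plain store */
	my_wmb();		/* order the payload store before the flag store */
	published = 1;		/* plain store */
}

/* Reader: returns 1 and fills *out if the flag was seen, 0 otherwise. */
static int try_consume(int *out)
{
	if (!published)		/* plain load, may well be stale */
		return 0;
	my_rmb();		/* order the flag load before the payload load */
	*out = payload;
	return 1;
}
```

A reader that sees published == 0 simply got an old value, which is fine; what the barriers guarantee is that a reader that does see the flag also sees the payload written before it.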

So the point about ACCESS_ONCE() is not, and never has been, about 
re-ordering. We know that the CPU may re-order the accesses and give us 
stale values (or values from the "future" wrt the other accesses around 
it). That's not the point. The point of ACCESS_ONCE() is that we get 
exactly _one_ value, and not two different ones (or none at all) because 
of the compiler either re-loading it several times or not re-loading it at 
all.
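
A small sketch of that "exactly one value" guarantee, assuming GCC or Clang (for __typeof__). The macro is modeled on the volatile-cast form the kernel used at the time; shared_limit and clamp_to_limit() are hypothetical names made up for the illustration:

```c
/* Volatile cast: the compiler must emit exactly one access for each use.
 * This says nothing about CPU ordering. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

static int shared_limit = 10;	/* imagine another thread may change this */

/* Clamp v to the limit, using one snapshot for both the test and the
 * result. Without ACCESS_ONCE() the compiler would be free to load
 * shared_limit twice and see two different values. */
static int clamp_to_limit(int v)
{
	int limit = ACCESS_ONCE(shared_limit);	/* single load */

	if (v > limit)
		return limit;
	return v;
}
```

The snapshot may still be stale by the time it is used. The macro only rules out the compiler splitting, repeating or dropping the access.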

Anybody who confuses ACCESS_ONCE() with ordering is simply confused.

And we don't want to make any "load with cache flush" either. Which side 
should the cache flush be on? Before? After? Both? Atomically? There is no 
sane semantics for that.

The only remaining sane semantics is to depend on memory barriers, and 
then make a magic memory barrier that is extra weak and doesn't order 
anything at all, but just says "synchronize very weakly".

And I think we have that in "cpu_relax()". Because if you have somebody 
doing shared memory accesses in a loop without any memory barriers or 
locks or anything (ie the _ordering_ doesn't matter, only that some value 
has been seen), then dang it, I can't see how you can _possibly_ use 
anything else than that "cpu_relax()" somewhere in that loop.
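
A rough userspace sketch of such a loop, assuming GCC or Clang inline asm. The cpu_relax() defined here is only an approximation of the kernel's (a "rep; nop" pause hint on x86, a bare compiler barrier elsewhere); ready, wait_until_ready() and the spin bound are hypothetical and exist just to keep the example finite:

```c
/* Approximation of the kernel's cpu_relax(): a pause hint on x86, and in
 * all cases a "memory" clobber, so the compiler must re-read memory after it. */
#if defined(__i386__) || defined(__x86_64__)
#define cpu_relax() __asm__ __volatile__("rep; nop" ::: "memory")
#else
#define cpu_relax() __asm__ __volatile__("" ::: "memory")
#endif

static int ready;	/* plain variable, set by some other thread */

/* Spin until ready is seen non-zero or max_spins iterations have passed.
 * Returns 1 if the flag was seen, 0 if the loop gave up. */
static int wait_until_ready(unsigned long max_spins)
{
	while (max_spins--) {
		if (ready)	/* plain load, no magic accessor */
			return 1;
		cpu_relax();	/* forces a fresh load on the next pass */
	}
	return 0;
}
```

No ordering is implied here: the loop only cares that some value of ready eventually shows up, and the clobber in cpu_relax() is what keeps the compiler from hoisting the load out of the loop.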

			Linus