lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Date:	Tue, 15 Jul 2014 10:45:19 -0700
From:	"Paul E. McKenney" <>
To:	Linus Torvalds <>
Cc:	Christoph Lameter <>,
	Kernel Mailing List <>,
	Andrew Morton <>,
	Rusty Russell <>,
	Tejun Heo <>, David Howells <>,
	Oleg Nesterov <>
Subject: Re: [PATCH RFC] percpu: add data dependency barrier in percpu
 accessors and operations

On Tue, Jul 15, 2014 at 10:32:13AM -0700, Linus Torvalds wrote:
> On Jul 15, 2014 9:12 AM, "Christoph Lameter" <> wrote:
> >
> > I mentioned that there is a barrier because the process of handing over
> > the offset to the other includes synchronization. In the slab case this is
> > a semaphore that is use to protect the structure and the list of
> > kmem_cache structures. The control struct containing the offset must be
> > entered somehow into something that tracks it for the future and thus
> > there is synchronization by the subsytem.
> Maybe. It's not at all obvious, though. That control structure may be a
> lockless percpu thing, after all. We have front end caches to the
> allocators etc..
> So yes, if there is a lock+unlock, that can be a memory barrier - but even
> that is not necessarily true on all architectures.
> Is there even always one, though? And what about the other CPU? Does it
> have any paired barrier?
> > And now we still see the old data. The cacheline changes of the initial
> > processor are ignored?
> They aren't "ignored". But without the proper barriers, the zeroing writes
> may still be buffered on the other CPU, and may not even have had a cache
> line allocated to them!
> That's the kind of thing that makes a barrier possibly required.
> But yes, the barrier may be implied by other synchronization if that
> exists. A lock doesn't necessarily help, though. A write before a lock can
> migrate down into the locked region, and a write after the lock may
> similarly migrate into the locked region, and then two such writes may be
> seen out of order on another CPU that doesn't take the lock itself to
> serialize.
> See? It's those kinds of things that can cause really subtle memory
> ordering problems. We generally never see them on x86, since writes are
> always ordered against other writes, and reads are allays ordered wrt other
> reads (but not reads against writes) and all locks are full memory barriers.
> But on other architectures even a lock+unlock may not be a barrier, since
> things moving into the locked region is fine (the lock means that things
> had better not move *out* of s locked region).
> So depending on serialization may not always be all that obvious. Even if
> it does happen to all work on x86.
> And don't get me wrong. I despise weak memory models. I think they are a
> mistake. I absolutely detest the alpha model, and I think PowerPC and ARM
> are wrong too. It's just that we have to try to care about their broken
> crap )^:

Well at least all the required barriers are free on TSO systems like
x86 and mainframe, and the read-side barriers are free evne on ARM and
PowerPC (not so much on DEC Alpha, but you can't have everything).  OK,
OK, there is a volatile cast and/or a barrier() directive that might
constrain the compiler a bit in some cases.

So this might well be crap, but at least it is (nearly) zero-cost crap
from the viewpoint of TSO machines like x86.  ;-)

							Thanx, Paul

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to
More majordomo info at
Please read the FAQ at

Powered by blists - more mailing lists