linux-kernel - Re: [PATCH RFC] percpu: add data dependency barrier in percpu accessors and operations

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Tue, 15 Jul 2014 11:12:20 -0500 (CDT)
From:	Christoph Lameter <cl@...two.org>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
cc:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Rusty Russell <rusty@...tcorp.com.au>,
	Tejun Heo <tj@...nel.org>, David Howells <dhowells@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Oleg Nesterov <oleg@...hat.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFC] percpu: add data dependency barrier in percpu
 accessors and operations

On Tue, 15 Jul 2014, Linus Torvalds wrote:

> Really, "before" and "after" have ABSOLUTELY NO MEANING unless you
> have a barrier. And you're arguing against those barriers. So you
> cannot use "before" as an argument, since in your world, no such thing
> even exists!

I mentioned that there is a barrier because the process of handing over
the offset to the other includes synchronization. In the slab case this is
a semaphore that is use to protect the structure and the list of
kmem_cache structures. The control struct containing the offset must be
entered somehow into something that tracks it for the future and thus
there is synchronization by the subsytem.

> > There are other arguments, but they basically boil down to "no other
> CPU ever accesses the per-cpu data of *this* CPU" (wrong) or "the
> users will do their own barriers" (maybe true, maybe not). Your "value
> is only available after" argument really isn't an argument. Not
> without those barriers.

Ok so what is happening is:

1. cacheline is zeroed on per_cpu_alloc but still exists in remote processor.

(we could actually insert code in alloc_percpu to ensure that the remote
caches are cleaned and not proceed unless that is complete. allocpercpu
is not performance critical).

2. cacheline is initialized with new values by the subsystem looping over
all percpu instances. Other processor still keeps the old data.

3. mutex is taken, list modifications occur, mutex is released. Remote
processor still keeps the old cacheline data.

4. Subsystem makes the percpu offset available.

5. The remote processor is processing using its instance of the per cpu
data for the first time using the offset to determine the percpu data for
its data. This typically means its updating the cacheline (and we hope
that the cacheline will be in exclusive state for good for performance reasons).

And now we still see the old data. The cacheline changes of the initial
processor are ignored?

Ok if this is the case then we have another way of dealing with this in
alloc_percpu. Either zap the relevant remote cpu caches after the areas
were zeroed or do an IPI to make the remote processor run the percpu area
initialization.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/