Date:	Tue, 17 Jun 2014 12:40:17 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Christoph Lameter <cl@...two.org>
Cc:	Tejun Heo <tj@...nel.org>, David Howells <dhowells@...hat.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Oleg Nesterov <oleg@...hat.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC] percpu: add data dependency barrier in percpu
 accessors and operations

On Tue, Jun 17, 2014 at 02:27:43PM -0500, Christoph Lameter wrote:
> On Thu, 12 Jun 2014, Tejun Heo wrote:
> 
> > percpu areas are zeroed on allocation and, by its nature, accessed
> > from multiple cpus.  Consider the following scenario.
> 
> I am not sure that the premise is actually right. Percpu areas are
> designed to be accessed from a single cpu and we provide instances
> of variables for each cpu.
> 
> There is no synchronization guarantee for accesses from other cpu. If
> these accesses occur then we tolerate some fuzziness and usually only do
> read accesses. F.e. for statistics if we loop over all cpus to get a sum
> of percpu counters (which is a classic use case for percpu data).
> 
> But there are numerous uses where no accesses from other cpus are required
> (mostly when percpu stuff is not used for statistics but for cpu local
> lists and status).
> 
> Cross cpu write accesses typically occur only after the allocation and
> before the code that actually does something is aware of the existence of
> the percpu area allocated or if the processor is being offlined/onlined.
> 
> >  p = NULL;
> >
> > 	CPU-1				CPU-2
> >  p = alloc_percpu()		if (p)
> > 					WARN_ON(this_cpu_read(*p));
> 
> p is an offset into the per cpu area of the processor. The value of p
> first has to be made available to cpu2 somehow and this usually provides
> the opportunity for synchronization that avoids the above scenario.
> 
> And so it is typical that these offsets are stored in larger structs that
> also have other means of synchronization.
> 
> F.e. Allocators take a global lock and then instantiate a new
> structure with the associated per cpu area allocation which is added to a
> global list after it is ready. The address of the allocator structure
> is then made available to other processors.
> 
> Another method is to perform this allocation on bootup which then also
> does not require synchronization (page allocator).
> 
> Similar in swapon(). The percpu allocation is performed before access to
> the containing structure (via enable_swap_info).

Those are indeed common use cases.  However...

There is code where one CPU writes to another CPU's per-CPU variables.
One example is RCU callback offloading, where a kernel thread (which
might be running anywhere) dequeues a given CPU's RCU callbacks and
processes them.  The act of dequeuing requires write access to that
CPU's per-CPU rcu_data structure.  And yes, atomic operations and memory
barriers are of course required to make this work.
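
To make the shape of that cross-CPU access concrete, here is a minimal
userspace sketch using C11 atomics rather than the kernel's own primitives.
It is not the actual RCU offloading code: NR_CPUS, struct percpu_cbs,
enqueue_cb(), offload_cbs(), and count_cb() are all hypothetical names, and
a plain array indexed by CPU number stands in for real per-CPU storage.
The owning CPU pushes callbacks with a release operation, and the offload
thread, which may run anywhere, detaches the whole list with an acquire
exchange, so it writes to another CPU's structure.

```c
#include <stdatomic.h>
#include <stddef.h>

#define NR_CPUS 4	/* hypothetical fixed CPU count for this sketch */

struct callback {
	struct callback *next;
	void (*func)(struct callback *cb);
};

/* Stand-in for a per-CPU structure such as rcu_data: one slot per CPU. */
struct percpu_cbs {
	_Atomic(struct callback *) head;
};

static struct percpu_cbs cb_lists[NR_CPUS];	/* zero-initialized */

/* Hypothetical demo callback: just counts how many callbacks ran. */
static int cbs_run;
static void count_cb(struct callback *cb)
{
	(void)cb;
	cbs_run++;
}

/* Owning CPU: push a callback onto the front of its own list. */
static void enqueue_cb(int cpu, struct callback *cb)
{
	struct callback *old =
		atomic_load_explicit(&cb_lists[cpu].head, memory_order_relaxed);

	do {
		cb->next = old;
		/* Release: cb's fields are visible before cb is published. */
	} while (!atomic_compare_exchange_weak_explicit(
			&cb_lists[cpu].head, &old, cb,
			memory_order_release, memory_order_relaxed));
}

/*
 * Offload thread, possibly on a different CPU: detach the whole list
 * (a write to that CPU's structure) and invoke each callback.
 * Returns the number of callbacks invoked.
 */
static int offload_cbs(int cpu)
{
	/* Acquire pairs with the release in enqueue_cb(). */
	struct callback *cb = atomic_exchange_explicit(
			&cb_lists[cpu].head, NULL, memory_order_acquire);
	int n = 0;

	while (cb) {
		struct callback *next = cb->next;	/* cb may be reused by func */

		cb->func(cb);
		cb = next;
		n++;
	}
	return n;
}
```

Without the release/acquire pairing, the offload thread could observe the
new list head before the callback's fields, which is the same class of
problem as the zeroed-allocation scenario quoted above.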

							Thanx, Paul
