linux-kernel - Re: [RFC PATCH] introduce sys_membarrier(): process-wide memory barrier

Date:	Thu, 7 Jan 2010 08:56:59 -0800
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Andi Kleen <andi@...stfloor.org>
Cc:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>,
	linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...e.hu>,
	akpm@...ux-foundation.org, josh@...htriplett.org,
	tglx@...utronix.de, peterz@...radead.org, rostedt@...dmis.org,
	Valdis.Kletnieks@...edu, dhowells@...hat.com, laijs@...fujitsu.com,
	dipankar@...ibm.com
Subject: Re: [RFC PATCH] introduce sys_membarrier(): process-wide memory
	barrier

On Thu, Jan 07, 2010 at 10:50:26AM +0100, Andi Kleen wrote:
> Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca> writes:
> 
> > Both the signal-based and the sys_membarrier userspace RCU schemes
> > permit us to remove the memory barrier from the userspace RCU
> > rcu_read_lock() and rcu_read_unlock() primitives, thus significantly
> > accelerating them. These memory barriers are replaced by compiler
> > barriers on the read-side, and all matching memory barriers on the
> > write-side are turned into an invocation of a memory barrier on all
> > active threads in the process. By letting the kernel perform this
> > synchronization rather than dumbly sending a signal to every thread in
> > the process (as we currently do), we diminish the number of unnecessary
> > wake-ups and only issue the memory barriers on active threads. Non-running
> > threads do not need to execute such barriers anyway, because these are
> > implied by the scheduler context switches.
> 
> I'm not sure all this effort is really needed on architectures
> with strong memory ordering.

Even CPUs with strong memory ordering allow later reads to complete
prior to earlier writes, which is enough to cause problems.

That said, some of the lighter-weight schemes sampling ->mm might be
safe on TSO machines.
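
The reordering in question is the classic store-buffering pattern: each
thread writes one flag and then reads the other. Below is a minimal
userspace sketch of it, assuming C11 atomics and pthreads. The names
(sb_count_both_zero() and friends) are purely illustrative and are not
part of the patch. Without a full barrier between the store and the load,
both threads may read 0, even on a TSO machine. With a full barrier in
both threads, that outcome is forbidden.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Shared state for one round of the store-buffering litmus test. */
static atomic_int x, y;	/* flags: thread0 writes x, thread1 writes y */
static int r0, r1;	/* value each thread read from the other's flag */
static int use_fence;	/* nonzero: full barrier between store and load */

static void *thread0(void *arg)
{
	(void)arg;
	atomic_store_explicit(&x, 1, memory_order_relaxed);
	if (use_fence)
		atomic_thread_fence(memory_order_seq_cst);	/* like smp_mb() */
	r0 = atomic_load_explicit(&y, memory_order_relaxed);
	return NULL;
}

static void *thread1(void *arg)
{
	(void)arg;
	atomic_store_explicit(&y, 1, memory_order_relaxed);
	if (use_fence)
		atomic_thread_fence(memory_order_seq_cst);	/* like smp_mb() */
	r1 = atomic_load_explicit(&x, memory_order_relaxed);
	return NULL;
}

/*
 * Run the test "rounds" times and return how many rounds ended with
 * r0 == 0 && r1 == 0, i.e. both loads completed before the other
 * thread's store became visible.  Returns -1 if a thread cannot be
 * created.
 */
int sb_count_both_zero(int rounds, int fence)
{
	int both_zero = 0;

	use_fence = fence;
	for (int i = 0; i < rounds; i++) {
		pthread_t t0, t1;

		atomic_store(&x, 0);
		atomic_store(&y, 0);
		if (pthread_create(&t0, NULL, thread0, NULL))
			return -1;
		if (pthread_create(&t1, NULL, thread1, NULL)) {
			pthread_join(t0, NULL);
			return -1;
		}
		pthread_join(t0, NULL);
		pthread_join(t1, NULL);	/* joins order the r0/r1 reads below */
		if (r0 == 0 && r1 == 0)
			both_zero++;
	}
	return both_zero;
}
```

Calling sb_count_both_zero(n, 1) should always return 0. Calling it with
fence == 0 may return a nonzero count, though how often the reordering
actually shows up depends on the hardware and on how closely the two
threads happen to overlap.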

> > + * The current implementation simply executes a memory barrier in an IPI handler
> > + * on each active cpu. Going through the hassle of taking run queue locks and
> > + * checking if the thread running on each online CPU belongs to the current
> > + * thread seems more heavyweight than the cost of the IPI itself.
> > + */
> > +SYSCALL_DEFINE0(membarrier)
> > +{
> > +	on_each_cpu(membarrier_ipi, NULL, 1);
> 
> Can't you use mm->cpu_vm_mask?

Hmmm...  Acquiring the corresponding lock would certainly make this
safe.  Not sure about lock-less access to it, though.

							Thanx, Paul