Date:	Tue, 5 Jan 2010 11:08:46 -0800 (PST)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
cc:	Christoph Lameter <cl@...ux-foundation.org>,
	Andi Kleen <andi@...stfloor.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	Minchan Kim <minchan.kim@...il.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Peter Zijlstra <peterz@...radead.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	"hugh.dickins" <hugh.dickins@...cali.co.uk>,
	Nick Piggin <nickpiggin@...oo.com.au>,
	Ingo Molnar <mingo@...e.hu>
Subject: Re: [RFC][PATCH 6/8] mm: handle_speculative_fault()



On Tue, 5 Jan 2010, Paul E. McKenney wrote:
> 
> But on many systems, it does take some time for the idle reads to make
> their way to the CPU that just acquired the lock.

Yes. But the point is that there are lots of them.

So think of it this way: every time _one_ CPU acquires a lock (and 
then releases it), _all_ CPU's will read the new value. Imagine the 
cross-socket traffic.

In contrast, doing just a single xadd (which replaces the whole 
"spin_lock+non-atomics+spin_unlock"), every time _one_ CPU acquires a 
lock, that's it. The other CPU's aren't all waiting in line for the lock 
to be released, and reading the cacheline to see if it's their turn.
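
As a rough userspace illustration of the two patterns being compared (this is 
not kernel code; the struct and function names are made up for the example, 
and C11 atomics stand in for the kernel's spinlock and xadd primitives):

```c
#include <stdatomic.h>

/* Hypothetical counter guarded by a simple test-and-set spinlock. */
struct locked_counter {
    atomic_flag lock;
    long value;
};

static void locked_counter_init(struct locked_counter *c)
{
    atomic_flag_clear(&c->lock);
    c->value = 0;
}

/* spin_lock + non-atomic update + spin_unlock; returns the old value. */
static long locked_counter_inc(struct locked_counter *c)
{
    /* Every waiter keeps touching the lock's cacheline until it wins. */
    while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
        ;
    long old = c->value++;
    atomic_flag_clear_explicit(&c->lock, memory_order_release);
    return old;
}

/* Hypothetical counter updated with one atomic read-modify-write. */
struct atomic_counter {
    atomic_long value;
};

static void atomic_counter_init(struct atomic_counter *c)
{
    atomic_init(&c->value, 0);
}

/* A single fetch-and-add (typically "lock xadd" on x86); returns the old
 * value. There is no lock word for other CPUs to spin on. */
static long atomic_counter_inc(struct atomic_counter *c)
{
    return atomic_fetch_add_explicit(&c->value, 1, memory_order_acq_rel);
}
```

Both increment functions return the previous value, so a caller sees the same 
result either way; the difference is only in how many times the cacheline has 
to move between CPUs under contention.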

Sure, after they got the lock they'll all eventually end up reading from 
that cacheline that contains 'struct mm_struct', but that's something we 
could even think about trying to minimize by putting the mmap_sem as far 
away from the other fields as possible.
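
A minimal sketch of that layout idea, assuming a 64-byte cacheline and using 
a made-up struct (this is not the real 'struct mm_struct' layout, and the 
field names are placeholders):

```c
#include <stdalign.h>
#include <stddef.h>

#define DEMO_CACHELINE 64 /* assumed cacheline size; varies by CPU */

/* Hypothetical stand-in for a structure with a contended counter. */
struct demo_mm {
    /* Fields that are read often after the lock has been taken. */
    alignas(DEMO_CACHELINE) long hot_fields[4];
    /* The contended word gets its own cacheline, away from the rest. */
    alignas(DEMO_CACHELINE) long sem_count;
};

/* Compile-time check that the two groups cannot share a cacheline. */
_Static_assert(offsetof(struct demo_mm, sem_count) -
               offsetof(struct demo_mm, hot_fields) >= DEMO_CACHELINE,
               "sem_count must not share a cacheline with hot_fields");
```

With this layout, traffic on the contended word does not also invalidate the 
line holding the other fields.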

Now, it's very possible that if you have a broadcast model of cache 
coherency, none of this much matters and you end up with almost all the 
same bus traffic anyway. But I do think it could matter a lot.

		Linus

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ