Message-ID: <alpine.LFD.2.00.1001051120430.3630@localhost.localdomain>
Date:	Tue, 5 Jan 2010 11:28:57 -0800 (PST)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Christoph Lameter <cl@...ux-foundation.org>
cc:	Andi Kleen <andi@...stfloor.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	Minchan Kim <minchan.kim@...il.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Peter Zijlstra <peterz@...radead.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	"hugh.dickins" <hugh.dickins@...cali.co.uk>,
	Nick Piggin <nickpiggin@...oo.com.au>,
	Ingo Molnar <mingo@...e.hu>
Subject: Re: [RFC][PATCH 6/8] mm: handle_speculative_fault()



On Tue, 5 Jan 2010, Christoph Lameter wrote:
> 
> The wait state is the processor being stopped due to not being able to
> access the cacheline. Not the processor spinning in the xadd loop. That
> only occurs if the critical section is longer than the timeout.

You don't know what you're talking about, do you?

Just go and read the source code.

The processor is currently spinning in the spin_lock loop. Here, I'll quote 
it to you:

                LOCK_PREFIX "xaddw %w0, %1\n"
                "1:\t"
                "cmpb %h0, %b0\n\t"
                "je 2f\n\t"
                "rep ; nop\n\t"
                "movb %1, %b0\n\t"
                /* don't need lfence here, because loads are in-order */
                "jmp 1b\n"

note the loop that spins - reading the thing over and over - waiting for 
_that_ CPU to be the owner of the xadd ticket?

That's the one you have now, only because x86-64 uses the STUPID FALLBACK 
CODE for the rwsemaphores!

In contrast, look at what the non-stupid rwsemaphore code does (which 
triggers on x86-32):

                     LOCK_PREFIX "  incl      (%%eax)\n\t"
                     /* adds 0x00000001, returns the old value */
                     "  jns        1f\n"
                     "  call call_rwsem_down_read_failed\n"

(That's a "down_read()", which happens to be the op we care most about.) 
See? That's a single locked "inc" (it avoids the xadd on the read side 
because of how we've biased things). In particular, notice how this means 
that we do NOT have fifty million CPU's all trying to read the same 
location while one writes to it successfully.

Spot the difference?

Here's putting it another way. Which of these scenarios do you think 
should result in less cross-node traffic:

 - multiple CPU's that - one by one - get the cacheline for exclusive 
   access.

 - multiple CPU's that - one by one - get the cacheline for exclusive 
   access, while other CPU's are all trying to read the same cacheline at 
   the same time, over and over again, in a loop.

See the shared part? See the difference? If you look at just a single lock 
acquire, it boils down to these two scenarios:

 - one CPU gets the cacheline exclusively

 - one CPU gets the cacheline exclusively while <n> other CPU's are all 
   trying to read the old and the new value.

It really is that simple.

		Linus
