Date:	Tue, 24 Sep 2013 10:10:27 +1000
From:	Benjamin Herrenschmidt <benh@...nel.crashing.org>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	"H. Peter Anvin" <hpa@...or.com>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	LKML <linux-kernel@...r.kernel.org>,
	Paul Mackerras <paulus@....ibm.com>,
	Ingo Molnar <mingo@...nel.org>,
	James Hogan <james.hogan@...tec.com>,
	"James E.J. Bottomley" <jejb@...isc-linux.org>,
	Helge Deller <deller@....de>,
	Martin Schwidefsky <schwidefsky@...ibm.com>,
	Heiko Carstens <heiko.carstens@...ibm.com>,
	"David S. Miller" <davem@...emloft.net>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Anton Blanchard <anton@....ibm.com>
Subject: Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix

On Sun, 2013-09-22 at 15:22 -0700, Linus Torvalds wrote:
>  - use %r13 for the per-thread thread-info pointer instead. A
> per-thread pointer is *not* volatile like the per-cpu base is.

 .../...

> Alternatively, make %r13 point to the percpu side, but make sure that
> you always use an asm accessor to fetch the value. In particular, I
> think you need to make __my_cpu_offset be an inline asm that fetches
> %r13 into some other register. Otherwise you can never get it right.

BTW, that boils down to a choice between using r13 either as a TLS
pointer for current or current_thread_info, or as a per-cpu pointer.
Which one is the most performance critical?

Now in the first case, it seems to me that using it as "current" rather
than "current_thread_info()" is a better idea, since we access current a
LOT more overall in the kernel. From there we can find a way to put
thread_info into the task struct (via the thread struct maybe) to make it
a simple offset from current.
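
To illustrate the layout idea, here is a minimal user-space sketch in plain
C. All names ending in _sim are hypothetical stand-ins, not the real kernel
structures, and a global variable plays the role of r13. It only shows that
once thread_info is embedded in the task struct (here via the thread
struct), getting from current to thread_info is a constant offset with no
extra load:

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the real kernel definitions. */
struct thread_info_sim {
    unsigned long flags;
    int preempt_count;
};

struct thread_struct_sim {
    unsigned long ksp;
    struct thread_info_sim info;    /* thread_info embedded here */
};

struct task_struct_sim {
    int pid;
    struct thread_struct_sim thread;
};

/* Assumption: this global stands in for r13 holding "current". */
static struct task_struct_sim *r13_sim;

static inline struct task_struct_sim *current_sim(void)
{
    return r13_sim;
}

/* thread_info is just current plus a compile-time constant offset. */
static inline struct thread_info_sim *current_thread_info_sim(void)
{
    return &current_sim()->thread.info;
}
```

In a real kernel the accessor would read the register rather than a global,
but the offset arithmetic would be the same.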

The big pro of that approach is of course that r13 becomes the TLS as
intended, and we can feel a lot more comfortable that we are "safe" vs.
whatever craziness gcc will come up with next.

The flip side is that per-cpu will remain a load away, so getting the
address of a per-cpu variable would typically be a 3 instruction deal
involving a load and a pair of adds to get to the address, then the
actual per-cpu access proper. This is equivalent to what we have today
(we put the per-cpu offset in the PACA). Using r13 as the per-cpu pointer
avoids that first load.
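
The difference between the two per-cpu schemes can be sketched in portable
C as follows. This is a user-space model under stated assumptions, not
kernel code: paca_sim, its data_offset field, and the two r13_* globals are
hypothetical stand-ins for the register and the PACA, and the exact
instruction counts on real hardware are not modelled:

```c
#include <stddef.h>

/* Hypothetical stand-in for the PACA; only the per-cpu offset is modelled. */
struct paca_sim {
    unsigned long data_offset;
};

/* Assumption: scheme A, r13 points at the PACA (as described for today). */
static struct paca_sim *r13_paca_sim;

/* Assumption: scheme B, r13 holds the per-cpu offset itself. */
static unsigned long r13_percpu_sim;

/* Scheme A: one load to fetch the offset, then add it to the address. */
static inline void *percpu_ptr_via_paca(void *var)
{
    unsigned long off = r13_paca_sim->data_offset;  /* the extra load */
    return (char *)var + off;
}

/* Scheme B: the offset is already in the "register", so no load. */
static inline void *percpu_ptr_direct(void *var)
{
    return (char *)var + r13_percpu_sim;
}
```

Both helpers return the same address for the same offset; the only
difference is whether the offset has to be loaded from memory first.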

So what's the most worthwhile thing to do here? I'm leaning toward option
1, i.e. stick current in r13 and feel a lot safer about it (I won't have to
scrutinize generated code all over the place to convince myself things
aren't crossing the barriers), and if the thread_info is in the task
struct, that makes accessing it really trivial & fast as well.

Cheers,
Ben.

