linux-kernel - Re: [RFC PATCH 0/3] Implement getcpu

Message-ID: <484967406.344576.1452644549992.JavaMail.zimbra@efficios.com>
Date:	Wed, 13 Jan 2016 00:22:29 +0000 (UTC)
From:	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To:	Ben Maurer <bmaurer@...com>
Cc:	Josh Triplett <josh@...htriplett.org>,
	Shane M Seymour <shane.seymour@....com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Paul Turner <pjt@...gle.com>, Andrew Hunter <ahh@...gle.com>,
	Peter Zijlstra <peterz@...radead.org>,
	linux-kernel@...r.kernel.org,
	linux-api <linux-api@...r.kernel.org>,
	Andy Lutomirski <luto@...capital.net>,
	Andi Kleen <andi@...stfloor.org>,
	Dave Watson <davejwatson@...com>, Chris Lameter <cl@...ux.com>,
	Ingo Molnar <mingo@...hat.com>, rostedt <rostedt@...dmis.org>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Russell King <linux@....linux.org.uk>,
	Catalin Marinas <catalin.marinas@....com>,
	Will Deacon <will.deacon@....com>,
	Michael Kerrisk <mtk.manpages@...il.com>
Subject: Re: [RFC PATCH 0/3] Implement getcpu_cache system call

----- On Jan 12, 2016, at 4:02 PM, Ben Maurer bmaurer@...com wrote:

>> One idea I have would be to let the kernel reserve some space either after the
>> first stack address (for a stack growing down) or at the beginning of the
>> allocated TLS area for each thread in copy_thread_tls() by fiddling with
>> sp or the tls base address when creating a thread.
> 
> Could this be implemented by having glibc use a well-known symbol name to define
> the per-thread TLS area? If a high-performance application wants to avoid any
> relocations in accessing this variable, it would define it and that definition
> would override glibc's. This is how things work with malloc. glibc has a
> default malloc implementation but we link jemalloc directly into our binaries.
> In addition to changing the malloc implementation, this means that calls to
> malloc don't go through the PLT.

Just to make sure I understand your proposal: glibc (or bionic...) would define
a well-known symbol with a weak attribute, e.g.:

int32_t __thread __attribute__((weak)) __getcpu_cache;

so that applications which care about bypassing the PLT can override it with:

int32_t __thread __getcpu_cache;

glibc/bionic would be responsible for calling the getcpu_cache() system call
to register/unregister this TLS variable for each thread.
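
To illustrate the override mechanism, here is a minimal sketch of what the
libc side could look like. Only the __getcpu_cache name and its int32_t type
come from the declarations above; the -1 initial value (used as a "not yet
registered" marker) and the read_cached_cpu() helper are assumptions made
for this example, not part of the proposed ABI.

```c
#include <assert.h>
#include <stdint.h>

/* The cache is exchanged with the kernel as a 32-bit value. */
static_assert(sizeof(int32_t) == 4, "cache must be 32 bits wide");

/*
 * Weak default definition, as a libc might provide it. An application
 * that defines a non-weak __getcpu_cache of the same type takes
 * precedence at link time.
 * Assumption: -1 marks "not yet filled in"; the proposal does not
 * specify an initial value.
 */
int32_t __thread __attribute__((weak)) __getcpu_cache = -1;

/*
 * Hypothetical helper: reading the cached CPU number is a plain TLS
 * load, with no system call on the fast path.
 */
static inline int32_t read_cached_cpu(void)
{
	return __getcpu_cache;
}
```

A reader would typically fall back to sched_getcpu() while the value is
still negative, i.e. before registration has happened for that thread.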

One thing I would like to figure out is whether applications and libraries
(e.g. the lttng-ust tracer) could start using getcpu_cache() before it gets
implemented in glibc, while remaining forward compatible with glibc once it
adds support.

We can declare __getcpu_cache as a weak symbol in arbitrary libraries, and
make them register/unregister the cache through the getcpu_cache syscall.
The main thing that I would need to tweak at the kernel level within the
system call would be to keep a refcount of the number of times the
__getcpu_cache is registered per thread. This would allow multiple registrations,
one per library (e.g. lttng-ust) and one for glibc, but we would validate
that they all register the exact same address for a given thread.
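
The per-thread bookkeeping described above could be modeled roughly as
follows. This is a user-space sketch of the logic only, not kernel code: the
struct cache_reg type, the cache_register()/cache_unregister() names, and the
-EINVAL/-EBUSY return values are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread state (would live in the task struct). */
struct cache_reg {
	int32_t *addr;		/* registered cache address, NULL if none */
	unsigned int refcount;	/* number of active registrations */
};

/*
 * Register addr for the thread described by reg. The first registration
 * records the address; later ones must pass the exact same address and
 * only bump the refcount.
 */
static int cache_register(struct cache_reg *reg, int32_t *addr)
{
	assert(reg != NULL);
	if (addr == NULL)
		return -EINVAL;
	if (reg->refcount == 0) {
		reg->addr = addr;
		reg->refcount = 1;
		return 0;
	}
	if (reg->addr != addr)
		return -EBUSY;	/* assumed error for a mismatched address */
	reg->refcount++;
	return 0;
}

/*
 * Drop one registration. The address is forgotten only when the last
 * user (library or libc) unregisters.
 */
static int cache_unregister(struct cache_reg *reg, int32_t *addr)
{
	assert(reg != NULL);
	if (reg->refcount == 0 || reg->addr != addr)
		return -EINVAL;
	if (--reg->refcount == 0)
		reg->addr = NULL;
	return 0;
}
```

With this model, lttng-ust and glibc registering the same __getcpu_cache
address in one thread would leave a refcount of 2, and the kernel would keep
updating the cache until both have unregistered.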

The reference counting trick should also work for cases where applications
define a non-weak __getcpu_cache, and want to call the getcpu_cache
system call to register it themselves (before glibc adds support for it).

Thoughts?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com