Message-ID: <b0a05883-befc-05f5-0bb7-d59b257238a6@arm.com>
Date: Thu, 20 Sep 2018 11:28:39 +0100
From: Szabolcs Nagy <szabolcs.nagy@....com>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc: nd@....com, carlos <carlos@...hat.com>,
Florian Weimer <fweimer@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ben Maurer <bmaurer@...com>,
Peter Zijlstra <peterz@...radead.org>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Boqun Feng <boqun.feng@...il.com>,
Will Deacon <will.deacon@....com>,
Dave Watson <davejwatson@...com>, Paul Turner <pjt@...gle.com>,
libc-alpha <libc-alpha@...rceware.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
linux-api <linux-api@...r.kernel.org>
Subject: Re: [RFC PATCH] glibc: Perform rseq(2) registration at nptl init and
thread creation
On 19/09/18 22:01, Mathieu Desnoyers wrote:
> ----- On Sep 19, 2018, at 1:38 PM, Szabolcs Nagy szabolcs.nagy@....com wrote:
>> note that libpthread.so is built with -ftls-model=initial-exec
>
> Which would indeed make these annotations redundant. I'll remove
> them.
>
>> (and if it wasn't then you'd want to put the attribute on the
>> declaration in the internal header file, not on the definition,
>> so the actual tls accesses generate the right code)
>
> This area is one where I'm still uneasy on my comprehension of
> the details, especially that it goes in a different direction than
> what you are recommending.
>
> I've read through https://www.akkadia.org/drepper/tls.pdf Section 5
> "Linker Optimizations" to try to figure it out, and I end up being
> under the impression that applying the tls_model("initial-exec")
> attribute to a symbol declaration in a header file does not have
> much impact on the accesses that use that variable. Reading through
> that section, it seems that the variable definition is the one that
> matters, and then the compiler/linker/loader are tweaking the sites
> that reference the TLS variable through code rewrite based on the
> most efficient mechanism that each phase knows can be used at each
> stage.
>
> What am I missing ?

in general, if you rely on linker relaxations you may not
get optimal code, because the linker cannot remove
instructions, it can only nop them out.

(e.g. on aarch64 an initial-exec access is 4 instructions,
while a general dynamic (tlsdesc) access is 6 instructions and
involves a call, so the return address has to be saved
and restored. that costs 3 more instructions for stack
operations if there were none otherwise, and the linker
cannot change those.)
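
to illustrate the declaration vs definition point, here is a
minimal sketch, assuming gcc or clang on an elf target. the names
example_tls_counter and example_bump are made up for this example
and are not from the patch. the attribute sits on the extern
declaration (what the internal header would carry), so the compiler
can emit the initial-exec sequence at each access site instead of
leaving a longer sequence for the linker to relax.

```c
/* Hypothetical names, for illustration only (not from the rseq patch). */

/* What the internal header would contain: the tls_model attribute is
   on the declaration, so every file including it sees the model when
   compiling accesses to the variable. */
extern __thread int example_tls_counter
    __attribute__((tls_model("initial-exec")));

/* The definition, which would live in exactly one .c file. */
__thread int example_tls_counter = 0;

/* An access site: the compiler picks the TLS access sequence here,
   using the model it knows for the variable at this point. */
int example_bump(void)
{
    return ++example_tls_counter;
}
```

comparing the output of "gcc -O2 -fPIC -S" with and without the
attribute on the declaration should show the difference in the
access sequence for the target in question.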