linux-kernel - Re: [PATCH 1/5] glibc: Perform rseq(2) registration at C startup and thread creation (v10)

Message-ID: <1239705947.14878.1558985272873.JavaMail.zimbra@efficios.com>
Date:   Mon, 27 May 2019 15:27:52 -0400 (EDT)
From:   Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To:     Florian Weimer <fweimer@...hat.com>
Cc:     carlos <carlos@...hat.com>, Joseph Myers <joseph@...esourcery.com>,
        Szabolcs Nagy <szabolcs.nagy@....com>,
        libc-alpha <libc-alpha@...rceware.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ben Maurer <bmaurer@...com>,
        Peter Zijlstra <peterz@...radead.org>,
        "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        Boqun Feng <boqun.feng@...il.com>,
        Will Deacon <will.deacon@....com>,
        Dave Watson <davejwatson@...com>, Paul Turner <pjt@...gle.com>,
        Rich Felker <dalias@...c.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        linux-api <linux-api@...r.kernel.org>
Subject: Re: [PATCH 1/5] glibc: Perform rseq(2) registration at C startup
 and thread creation (v10)

----- On May 27, 2019, at 7:19 AM, Florian Weimer fweimer@...hat.com wrote:

> * Mathieu Desnoyers:
> 
>> +/* volatile because fields can be read/updated by the kernel.  */
>> +__thread volatile struct rseq __rseq_abi = {
>> +  .cpu_id = RSEQ_CPU_ID_UNINITIALIZED,
>> +};
> 
> As I've explained repeatedly, the volatile qualifier is wrong because it
> is impossible to get rid of it.  (Accessing an object declared volatile
> using non-volatile pointers is undefined.)  Code using __rseq_abi should
> use relaxed MO atomics or signal fences/compiler barriers, as
> appropriate.

Hi Florian,

OK. So let's remove the volatile.

This means the sched_getcpu() implementation will need to load __rseq_abi.cpu_id
with an atomic_load_relaxed(), am I correct?

This field can be updated by the kernel at any point of user-space execution
due to preemption, so we need to ensure the load is performed as a single
instruction to prevent the compiler from tearing the load, and to force it
to re-fetch the value within loops.

It would become:

int
sched_getcpu (void)
{
  int cpu_id = atomic_load_relaxed (&__rseq_abi.cpu_id);

  return cpu_id >= 0 ? cpu_id : vsyscall_sched_getcpu ();
}
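
As a minimal sketch of what that relaxed load amounts to outside the glibc
tree: this assumes atomic_load_relaxed() behaves like the GCC/Clang
__atomic_load_n() builtin with __ATOMIC_RELAXED. demo_cpu_id and
demo_fallback_getcpu() are hypothetical stand-ins for __rseq_abi.cpu_id and
vsyscall_sched_getcpu(), not the real symbols.

```c
#include <assert.h>

/* Hypothetical stand-in for __rseq_abi.cpu_id (an int here for simplicity).
   -1 mirrors RSEQ_CPU_ID_UNINITIALIZED. Note: no volatile qualifier. */
__thread int demo_cpu_id = -1;

/* Hypothetical stand-in for vsyscall_sched_getcpu(); always reports CPU 0. */
int demo_fallback_getcpu(void)
{
    return 0;
}

int demo_sched_getcpu(void)
{
    /* Relaxed atomic load: the compiler must emit one untorn access and
       may not reuse a value cached from an earlier iteration of a loop.
       No ordering with respect to other memory accesses is implied. */
    int cpu_id = __atomic_load_n(&demo_cpu_id, __ATOMIC_RELAXED);

    /* Negative values mean "not registered"; use the slow path then. */
    return cpu_id >= 0 ? cpu_id : demo_fallback_getcpu();
}
```

The point of the sketch is that the atomicity lives in the access, not in the
declaration, so other code remains free to take ordinary pointers to the
object.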

> 
>> +/* Advertise Restartable Sequences registration ownership across
>> +   application and shared libraries.
>> +
>> +   Libraries and applications must check whether this variable is zero or
>> +   non-zero if they wish to perform rseq registration on their own. If it
>> +   is zero, it means restartable sequence registration is not handled, and
>> +   the library or application is free to perform rseq registration. In
>> +   that case, the library or application is taking ownership of rseq
>> +   registration, and may set __rseq_handled to 1. It may then set it back
>> +   to 0 after it completes unregistering rseq.
>> +
>> +   If __rseq_handled is found to be non-zero, it means that another
>> +   library (or the application) is currently handling rseq registration.
>> +
>> +   Typical use of __rseq_handled is within library constructors and
>> +   destructors, or at program startup.  */
>> +
>> +int __rseq_handled;
> 
> It's not clear to me whether the intent is that __rseq_handled reflects
> kernel support for rseq or not.

If __rseq_handled is set, it means a library is managing the rseq registration.
This is independent of whether the kernel supports rseq.

If e.g. glibc manages rseq registration, it sets __rseq_handled to 1. It will
then query the kernel for rseq availability. If the kernel happens to not
support rseq, the __rseq_abi.cpu_id will be set to RSEQ_CPU_ID_REGISTRATION_FAILED,
which means the registration has failed.

The kernel does not support rseq in that scenario, and it would be pointless
for an early adopter library to try to also register it.

As soon as a library changes the state of __rseq_abi.cpu_id, it is indeed
managing rseq registration. Perhaps the meaning of "handling" rseq registration
should be clarified in the comment.

> Currently, it only tells us whether
> glibc has been built with rseq support or not.  It does not reflect
> kernel support.

We know we have kernel support if __rseq_abi.cpu_id >= 0.
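
To illustrate how a reader of cpu_id could tell these states apart, here is a
rough sketch. The two negative constants match the values in the kernel's
<linux/rseq.h> (RSEQ_CPU_ID_UNINITIALIZED is -1 and
RSEQ_CPU_ID_REGISTRATION_FAILED is -2); the DEMO_ names, the enum and
demo_classify_cpu_id() are hypothetical and exist only for this example. The
value is assumed to have been read into an int first, as sched_getcpu() does
above.

```c
#include <assert.h>

/* Mirror of the kernel UAPI values, renamed to avoid clashing with the
   real header in this standalone example. */
enum {
    DEMO_CPU_ID_UNINITIALIZED = -1,
    DEMO_CPU_ID_REGISTRATION_FAILED = -2
};

/* Hypothetical classification of what a cpu_id value tells us. */
enum demo_rseq_state {
    DEMO_RSEQ_REGISTERED,          /* cpu_id >= 0: kernel support confirmed */
    DEMO_RSEQ_NOT_YET_REGISTERED,  /* nobody has attempted registration */
    DEMO_RSEQ_REGISTRATION_FAILED, /* attempted, but the kernel refused */
    DEMO_RSEQ_UNKNOWN              /* any other negative value */
};

enum demo_rseq_state demo_classify_cpu_id(int cpu_id)
{
    if (cpu_id >= 0)
        return DEMO_RSEQ_REGISTERED;
    if (cpu_id == DEMO_CPU_ID_UNINITIALIZED)
        return DEMO_RSEQ_NOT_YET_REGISTERED;
    if (cpu_id == DEMO_CPU_ID_REGISTRATION_FAILED)
        return DEMO_RSEQ_REGISTRATION_FAILED;
    return DEMO_RSEQ_UNKNOWN;
}
```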

>  I'm still not convinced that this symbol is necessary,
> especially if we mandate a kernel header version which defines __NR_rseq
> for building glibc (which may happen due to the time64_t work).

__NR_rseq is not yet supported by all Linux architectures. So we will need
to support building glibc against kernel headers that do not define __NR_rseq
for quite a while anyway.

Moreover, this does not solve the issue tackled by __rseq_handled: early
adopter libraries managing rseq registration built against older glibc
versions which eventually end up running within a process linked against
a newer glibc which handles rseq registration.
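
A sketch of the ownership handshake such an early adopter library might
perform, under stated assumptions: demo_rseq_handled is a hypothetical
stand-in for __rseq_handled, both helper functions are invented for this
example, and the accesses are plain because the check is assumed to run only
from constructors, destructors or program startup, as the comment in the
patch describes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the __rseq_handled symbol. Zero means nobody
   is managing rseq registration for this process. */
int demo_rseq_handled;

/* Returns true if the caller has just taken ownership of rseq
   registration, false if someone else (e.g. a newer libc) already
   manages it and the caller must not register on its own. */
bool demo_try_take_rseq_ownership(void)
{
    if (demo_rseq_handled != 0)
        return false;
    demo_rseq_handled = 1;
    return true;
}

/* To be called by the owner only, after it has finished unregistering. */
void demo_release_rseq_ownership(void)
{
    demo_rseq_handled = 0;
}
```

With this scheme, a library built against an older libc keeps working
unchanged: when the flag is already set, it simply skips its own registration.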

> 
> Furthermore, the reference to ELF constructors is misleading.  I believe
> the code you added to __libc_start_main to initialize __rseq_handled and
> register __rseq_abi with the kernel runs *after* ELF constructors have
> executed (and not at all if the main program is written in Go, alas).
> All initialization activity for the shared case needs to happen in
> elf/rtld.c or called from there, probably as part of the security
> initialization code or thereabouts.

In elf/rtld.c:dl_main() we have the following code:

  /* We do not initialize any of the TLS functionality unless any of the
     initial modules uses TLS.  This makes dynamic loading of modules with
     TLS impossible, but to support it requires either eagerly doing setup
     now or lazily doing it later.  Doing it now makes us incompatible with
     an old kernel that can't perform TLS_INIT_TP, even if no TLS is ever
     used.  Trying to do it lazily is too hairy to try when there could be
     multiple threads (from a non-TLS-using libpthread).  */
  bool was_tls_init_tp_called = tls_init_tp_called;
  if (tcbp == NULL)
    tcbp = init_tls ();

If I understand your point correctly, I should move the rseq_init() and
rseq_register_current_thread() for the SHARED case just after this
initialization, otherwise calling those from LIBC_START_MAIN() is too
late and it runs after the initial modules' constructors (or not at all for
Go). However, this means glibc will start using TLS internally. I'm
concerned that this is not quite in line with the above comment which
states that TLS is not initialized if no initial modules use TLS.

For the !SHARED use-case, if my understanding is correct, I should keep
rseq_init() and rseq_register_current_thread() calls within LIBC_START_MAIN().

Thoughts ?

Thanks for the feedback!

Mathieu



> 
> Thanks,
> Florian

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com