linux-kernel - Re: Restartable Sequences system call merged into Linux

Message-ID: <417742741.11550.1528821084084.JavaMail.zimbra@efficios.com>
Date:   Tue, 12 Jun 2018 12:31:24 -0400 (EDT)
From:   Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To:     Florian Weimer <fweimer@...hat.com>
Cc:     carlos <carlos@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
        "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        Boqun Feng <boqun.feng@...il.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        libc-alpha <libc-alpha@...rceware.org>
Subject: Re: Restartable Sequences system call merged into Linux

----- On Jun 12, 2018, at 9:11 AM, Florian Weimer fweimer@...hat.com wrote:

> On 06/11/2018 10:04 PM, Mathieu Desnoyers wrote:
>> ----- On Jun 11, 2018, at 3:55 PM, Florian Weimer fweimer@...hat.com wrote:
>> 
>>> On 06/11/2018 09:49 PM, Mathieu Desnoyers wrote:
>>>> It should be noted that there can be only one rseq TLS area registered
>>>> per thread, which can then be used by many libraries and by the
>>>> executable, so this is a process-wide (per-thread) resource that we
>>>> need to manage carefully.
>>>
>>> Is it possible to resize the area after thread creation, perhaps even
>>> from other threads?
>> 
>> I'm not sure why we would want to resize it. The per-thread area is fixed-size.
>> Its layout is here: include/uapi/linux/rseq.h: struct rseq
> 
> Looks like I was mistaken and this is very similar to the robust mutex list.
> 
> Should we treat it the same way?  Always allocate it for each new thread
> and register it with the kernel?

That would be an efficient way to do it, indeed. There is very little
performance overhead in having rseq registered for all threads, whether or
not they intend to run rseq critical sections.

> 
>> The ABI is designed so that all users (program and libraries) can interact
>> through this per-thread TLS area.
> 
> Then the user code needs just the address of the structure.

Yes.
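
To illustrate, here is a minimal sketch of what "just the address" means
for user code. It assumes a local mirror of the fixed-size area (field
names follow struct rseq in include/uapi/linux/rseq.h, but struct
rseq_area, demo_rseq_abi and rseq_current_cpu() are hypothetical names
used only for this example), and it does not perform any registration
with the kernel:

```c
#include <stdint.h>

/* Illustrative mirror of the per-thread area; the authoritative layout
 * is struct rseq in include/uapi/linux/rseq.h. */
struct rseq_area {
    uint32_t cpu_id_start;
    uint32_t cpu_id;      /* negative values mean "not registered" */
    uint64_t rseq_cs;     /* pointer to the current critical section descriptor */
    uint32_t flags;
} __attribute__((aligned(4 * sizeof(uint64_t))));

/* Assumed to match RSEQ_CPU_ID_UNINITIALIZED in the uapi header. */
#define DEMO_CPU_ID_UNINITIALIZED ((uint32_t)-1)

/* Hypothetical per-thread instance standing in for the shared TLS area. */
static __thread struct rseq_area demo_rseq_abi = {
    .cpu_id = DEMO_CPU_ID_UNINITIALIZED,
};

/* A user of the area needs only its address: it reads the CPU number
 * that the kernel keeps up to date once the area is registered. */
static inline int32_t rseq_current_cpu(const volatile struct rseq_area *area)
{
    return (int32_t)area->cpu_id;
}
```

A caller would treat a negative return value as "rseq not available for
this thread" and fall back to another way of getting the CPU number.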

> 
> How much coordination is needed between different users of this
> interface?  Looking at the section hacks, I don't think we want to
> put this into glibc at this stage.  It looks more like something for
> which we traditionally require compiler support.

I really don't mind maintaining a separate project containing librseq
along with the headers needed to facilitate declaration of rseq critical
sections. That part does not need much coordination between users of
the interface.

The part which really requires coordination between users is registration
to the kernel (and ownership) of the rseq TLS area.

I have a few possible approaches in mind (feel free to suggest other
options):

A) glibc exposes a strong __rseq_abi TLS symbol:

   - should ideally *not* be global-dynamic for performance reasons,
   - registration to kernel can either be handled explicitly by requiring
     application or libraries to call an API, or implicitly at thread
     creation,
   - requires all rseq users to upgrade to newer glibc. Early rseq users
     (libs and applications) registering their own rseq TLS will conflict
     with newer glibc.

B) librseq.so exposes a strong __rseq_abi symbol:

   - should ideally *not* be global-dynamic for performance reasons, but
     testing shows that using initial-exec causes issues in situations where
     librseq.so ends up being dlopen'd (e.g. java virtual machine dlopening
     the lttng-ust tracer linked against librseq.so),
   - registration/unregistration of the area with the kernel can either be
     performed lazily on first use, with unregistration done by a
     pthread_key destructor, or require an explicit API call from the
     application,
   - a per-thread refcount in a TLS variable could allow many users to call
     the registration/unregistration API, and would also allow lazy
     registration,
   - an early-user application which also exposes a __rseq_abi strong symbol
     would conflict with librseq.so.

C) __rseq_abi symbol declared weak within each user (application, librseq,
   other libraries, glibc):

   - should ideally *not* be global-dynamic for performance reasons,
     - however, initial-exec causes issues when librseq or early user libraries
       are dlopen'd (e.g. java runtime dlopening lttng-ust),
   - a weak symbol allows combining early user libs/apps with glibc/librseq
     exposing the same symbol,
   - considering that glibc is AFAIK never dlopen'd, this does not cause
     exhaustion of initial-exec TLS entries in cases where librseq.so or
     early adopter libs are dlopen'd,
   - if glibc implicitly registers the rseq area, *and* librseq.so also wants
     to register it, *and* early adopters also want to register it, we should
     come up with a refcount scheme in the TLS ensuring that registration is
     only done when the first user comes and unregistration is only done
     when the last user goes away.
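
To make the refcount idea in (C) concrete, here is a rough sketch, not a
proposed implementation. The names are all hypothetical, and the two
demo_sys_*() helpers are stand-ins that only record what the rseq system
call would be asked to do, so that the counting logic can be read and
exercised on its own. The refcount is declared weak and initial-exec to
show how each user could carry the same declaration:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the rseq system call (assumption: a real version would
 * register/unregister the thread's TLS area with the kernel here). */
static __thread int demo_kernel_registered; /* 1 while the area is registered */
static __thread int demo_kernel_calls;      /* number of simulated system calls */

static int demo_sys_register(void)
{
    assert(!demo_kernel_registered);   /* never register twice */
    demo_kernel_registered = 1;
    demo_kernel_calls++;
    return 0;
}

static int demo_sys_unregister(void)
{
    assert(demo_kernel_registered);    /* never unregister when not registered */
    demo_kernel_registered = 0;
    demo_kernel_calls++;
    return 0;
}

/* Per-thread count of users; weak so that every user (application,
 * libraries) can carry the same definition and share one instance. */
__attribute__((weak, tls_model("initial-exec")))
__thread uint32_t demo_rseq_refcount;

/* Called by each user before it runs rseq critical sections. */
int demo_rseq_register_current_thread(void)
{
    if (demo_rseq_refcount == UINT32_MAX)
        return -1;                     /* would overflow */
    if (demo_rseq_refcount++ > 0)
        return 0;                      /* an earlier user already registered */
    if (demo_sys_register() != 0) {
        demo_rseq_refcount--;          /* undo on failure */
        return -1;
    }
    return 0;
}

/* Called by each user when it is done with rseq on this thread. */
int demo_rseq_unregister_current_thread(void)
{
    if (demo_rseq_refcount == 0)
        return -1;                     /* unbalanced call */
    if (--demo_rseq_refcount > 0)
        return 0;                      /* other users remain */
    return demo_sys_unregister();      /* last user goes away */
}
```

With this, glibc, librseq.so and an early adopter could each call the
register function on the same thread and only the first call would reach
the kernel. The sketch ignores signal handlers that might themselves
register or unregister, which a real scheme would need to consider.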

Thoughts?

Thanks!

Mathieu

> 
> Thanks,
> Florian

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com