lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <87a75cgkb5.fsf@oldenburg2.str.redhat.com>
Date:   Fri, 21 Feb 2020 14:07:58 +0100
From:   Florian Weimer <fweimer@...hat.com>
To:     Chris Kennelly <ckennelly@...gle.com>
Cc:     Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        "Joel Fernandes\, Google" <joel@...lfernandes.org>,
        Paul Turner <pjt@...gle.com>,
        "Carlos O'Donell" <codonell@...hat.com>,
        libc-alpha <libc-alpha@...rceware.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        paulmck <paulmck@...nel.org>, Boqun Feng <boqun.feng@...il.com>,
        Brian Geffon <bgeffon@...gle.com>
Subject: Re: Rseq registration: Google tcmalloc vs glibc

Quoting in full to get this message to libc-alpha, past the text/html
filter.  I wish we had a different list configuration …

Please Note that we have integration patches for glibc which need
review.  A fair number of them have been written by me, so I can't help
with that.

* Chris Kennelly:

> On Thu, Feb 20, 2020 at 4:33 PM Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
> wrote:
>
>  Hi Chris,
>
>  As one of the maintainers of the Rseq system call in the Linux kernel, I would
>  like to thank the Google team for open sourcing a tcmalloc implementation based
>  on Rseq!
>
>  I've looked into some critical integration aspects of the tcmalloc implementation,
>  and would like to bring up a topic which involves both tcmalloc developers and the
>  glibc community.
>
> Thanks.  To answer the later questions first:
> * I implemented TCMalloc's upstream rseq based on the conventions I could find from
> (mostly) the kernel self tests.  This is probably why it looks like #1 :)
>
> This is less of an intentional preference and more of "it's important that early adopters follow
> a convention" for future glibc upgrades.  Initializing __rseq_abi is the most important, but
> there are some other ones, mostly for debugging/tracing
> (https://github.com/compudj/librseq/issues/1), that I'd like to get right too.
>
> * The TCMalloc project does not provide ABI stability, so TCMalloc can change the convention
> it follows.
>  
>  I have been discussing aspects of co-existence between early Rseq adopter libraries
>  and glibc for the past year with the glibc community, and tcmalloc happens to be the
>  first project to publicly use Rseq outside of prototype branches or selftests code.
>  Considering that there can only be one Rseq registration per thread (as imposed by
>  the rseq ABI), there needs to be some kind of protocol between libraries to ensure we
>  don't introduce regressions when we eventually combine a newer glibc which takes care
>  of registration of the __rseq_abi TLS along with tcmalloc which also try to perform
>  that registration within the same thread.
>
>  Throughout the various rounds of review of the glibc Rseq integration patch set,
>  there were a few solutions envisioned. Here is a brief history:
>
>  1) Introduce a __rseq_refcount TLS variable.
>
>  - Currently used by Linux tools/testing/selftests/rseq/rseq.c,
>  - Currently used by Google tcmalloc,
>  - Emitted by glibc as well my the original patchset (but was later removed),
>
>  A user incrementing the refcount from 0 -> 1 performs rseq registration.
>  The last user decrementing from 1 -> 0 performs rseq unregistration.
>
>  Works for co-existence of dlopen'd/dlclose'd libraries, for dynamically
>  linked libraries, and for the main executable.
>
>  The refcounting was deemed too complex for glibc's needs (it always
>  exists for the entire executable's lifetime), so we moved to
>  __rseq_handled instead.
>
>  2) Introduce a __rseq_handled global variable.
>
>  - Currently used by Linux tools/testing/selftests/rseq/rseq.c,
>  - At some point emitted by glibc as well in my patch set (but was later
>    removed),
>
>  A library may take rseq ownership if it is still 0 when executing the
>  library constructor. Set to 1 by library constructor when handling rseq.
>  Set to 0 in destructor if handling rseq.
>
>  Not meant to be set by dlopen'd/dlclose'd libraries, only by libraries
>  existing for the whole lifetime of the executable and/or the main executable.
>
>  This __rseq_handled symbol has been identified as being somewhat redundant
>  with the information provided in the __rseq_abi.cpu_id field (uninitialized
>  state), which motivated removing this symbol from the glibc integration
>  entirely. The only reason for having __rseq_handled separate from
>  __rseq_abi.cpu_id was because it was then impossible to touch TLS data
>  early in the glibc initialization. This issue was later resolved within
>  glibc.
>
>  3) Use the  __rseq_abi TLS cpu_id field to know whether Rseq has been
>  registered.
>
>  - Current protocol in the most recent glibc integration patch set.
>  - Not supported yet by Linux kernel rseq selftests,
>  - Not supported yet by tcmalloc,
>
>  Use the per-thread state to figure out whether each thread need to register
>  Rseq individually.
>
>  Works for integration between a library which exists for the entire lifetime
>  of the executable (e.g. glibc) and other libraries. However, it does not
>  allow a set of libraries which are dlopen'd/dlclose'd to co-exist without
>  having a library like glibc handling the registration present.
>
> Overall, I like #3 the most due to its simplicity, but I also do not need to support the
> dlopen/dlclose use case (below).
>  
>  So overall, I suspect the protocol we want for early adopters is that they
>  only register Rseq if __rseq_abi.cpu_id == RSEQ_CPU_ID_UNINITIALIZED, which
>  ensure they do not get -1, errno = EBUSY when linked against a newer glibc
>  which handles Rseq registration. In order to handle multiple early adopters
>  dlopen'd/dlclose'd in the same executable, those should synchronize with a
>  __rseq_refcount TLS reference count, but it does not have to be taken into
>  account by the main executable or libraries present for the entire executable
>  lifetime (like glibc). 
>
>  Based on this, what I think would be missing from the current Google tcmalloc
>  implementation is a check for __rseq_abi.cpu_id == RSEQ_CPU_ID_UNINITIALIZED
>  in InitThreadPerCpu().
>
> TCMalloc does not get to InitThreadPerCpu without that check.
>
> Before initialization happens, we
> * end up on a slow path
> https://github.com/google/tcmalloc/blob/master/tcmalloc/tcmalloc.cc#L1486
> * which checks UsePerCpuCache
> https://github.com/google/tcmalloc/blob/master/tcmalloc/cpu_cache.h#L222
> * and inspects the TLS variable in IsFast
> https://github.com/google/tcmalloc/blob/master/tcmalloc/internal/percpu.h#L171
> ...which triggers per-thread rseq registration if __rseq_abi.cpu_id is uninitialized.
>
> Otherwise, the IsOnFastPath() call checks also inspects __rseq_abi.cpu_id via IsFastNoInit
> (same thing, but no registration triggered).
>
>  Is tcmalloc ever meant to be dlopen'd/dlclose'd (either directly or indirectly),
>  or is it required to exist for the entire executable lifetime ? The check and
>  increment of __rseq_refcount is only useful to co-exist with dlopen'd/dlclose'd
>  libraries, but it would not allow discovering the presence of a glibc which
>  takes care of the rseq registration with the planned protocol. A dlopen'd
>  library should then only perform rseq unregistration if if brings the
>  __rseq_refcount back to 0 (e.g. in a pthread_key destructor).
>
> TCMalloc cannot practically be dlopen'd or dlclose'd.
> * Once memory is allocated with one instance of malloc (or operator new), it needs to be
> free'd to the same heap.
> * dlclose is explicitly not supported by our dependencies ("do not rely on dynamic
> unloading" https://abseil.io/about/compatibility)
>
> Thanks,
> Chris
>
>  Adding this check for __rseq_abi.cpu_id == RSEQ_CPU_ID_UNINITIALIZED is something
>  I need to do in the Linux rseq selftests, but I refrained from submitting any
>  further change to those tests until the glibc rseq integration gets finally
>  merged.
>
>  Is it something that could be easily changed at this stage in Google tcmalloc,
>  or should we reconsider adding back __rseq_refcount within the glibc integration
>  patch set, even though it is not strictly useful to glibc ?
>
>  Thanks,
>
>  Mathieu
>
>  -- 
>  Mathieu Desnoyers
>  EfficiOS Inc.
>  http://www.efficios.com

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ