Message-ID: <775688146.12145.1594748580461.JavaMail.zimbra@efficios.com>
Date: Tue, 14 Jul 2020 13:43:00 -0400 (EDT)
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Peter Oskolkov <posk@...k.io>
Cc: Peter Zijlstra <peterz@...radead.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>,
paulmck <paulmck@...ux.ibm.com>,
Boqun Feng <boqun.feng@...il.com>,
"H. Peter Anvin" <hpa@...or.com>, Paul Turner <pjt@...gle.com>,
linux-api <linux-api@...r.kernel.org>,
Christian Brauner <christian.brauner@...ntu.com>,
Florian Weimer <fw@...eb.enyo.de>, carlos <carlos@...hat.com>,
Peter Oskolkov <posk@...gle.com>
Subject: Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
----- On Jul 14, 2020, at 1:24 PM, Peter Oskolkov posk@...k.io wrote:
> At Google, we actually extended struct rseq (I will post the patches
> here once they are fully deployed and we have specific
> benefits/improvements to report). We did this by adding several fields
> below __u32 flags (the last field currently), and correspondingly
> increasing rseq_len in rseq() syscall. If the kernel does not know of
> this extension, it will return -EINVAL due to an unexpected rseq_len;
> then the application can either fall-back to the standard/upstream
> rseq, or bail. If the kernel does know of this extension, it accepts
> it. If the application passes the old rseq_len (32), the kernel knows
> that this is an old application and treats it as such.
>
> I looked through the archives, but I did not find specifically why the
> pretty standard approach described above is considered inferior to the
> one taken in this patch (freeze rseq_len at 32, add additional length
> fields to struct rseq). Can these be summarized?
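
For reference, the rseq_len-based negotiation quoted above can be sketched
roughly as follows. This is only an illustration, not the Google patches:
mock_rseq_register() is a hypothetical stand-in for the kernel's length
check, and RSEQ_LEN_EXTENDED (64) is an assumed value. Only the upstream
length of 32 comes from the quoted text.

```c
#include <errno.h>
#include <stdint.h>

#define RSEQ_LEN_UPSTREAM 32u /* frozen upstream length, per the quoted text */
#define RSEQ_LEN_EXTENDED 64u /* hypothetical extended length (assumption) */

/*
 * Hypothetical stand-in for the kernel-side length check.
 * kernel_max_len is the largest layout this (mock) kernel knows about.
 */
static int mock_rseq_register(uint32_t rseq_len, uint32_t kernel_max_len)
{
	if (rseq_len == RSEQ_LEN_UPSTREAM)
		return 0;		/* old application, treated as such */
	if (rseq_len == RSEQ_LEN_EXTENDED &&
	    kernel_max_len >= RSEQ_LEN_EXTENDED)
		return 0;		/* kernel knows the extension */
	return -EINVAL;			/* unexpected rseq_len */
}

/*
 * Try the extended layout first, then fall back to the upstream one.
 * Returns the accepted length, or 0 if registration failed entirely.
 */
static uint32_t register_with_fallback(uint32_t kernel_max_len)
{
	if (mock_rseq_register(RSEQ_LEN_EXTENDED, kernel_max_len) == 0)
		return RSEQ_LEN_EXTENDED;
	if (mock_rseq_register(RSEQ_LEN_UPSTREAM, kernel_max_len) == 0)
		return RSEQ_LEN_UPSTREAM;
	return 0;
}
```

Note that this only works if the code performing the registration knows
the size of the structure it registers.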

I think you don't face the issues I'm facing with libc rseq integration
because you control the entire user-space software ecosystem at Google.

The main issue we face is that the library responsible for registering
rseq (either glibc 2.32+, an early-adopter librseq library, or the
application) may very well not be the same library that defines the
__rseq_abi symbol used in the global symbol table. Interposition with
LD_PRELOAD, or defining __rseq_abi in the program's executable, are good
examples of this kind of scenario, and those use-cases are supported.

So the size of the __rseq_abi structure may be larger than the struct
rseq known by glibc (and eventually smaller, if a future glibc version
extends its struct rseq but is loaded alongside an older program/library
doing __rseq_abi interposition).

So we need some way to allow the code defining __rseq_abi to let the
kernel know how much room is available, without necessarily requiring the
code responsible for rseq registration to be aware of that extended
layout. This is the purpose of the RSEQ_FLAG_TLS_SIZE flag in
__rseq_abi.flags, combined with the __rseq_abi.user_size field.

And we need some way to allow the kernel to let user-space rseq critical
sections (user code) know how many of those fields are actually populated
by the kernel. This is the purpose of the RSEQ_FLAG_TLS_SIZE flag in
__rseq_abi.flags, combined with the __rseq_abi.kernel_size field.
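
A minimal sketch of that two-way size handshake is below. It is an
illustration under stated assumptions, not the layout from the patch: the
flag value, the field widths and offsets, new_field, and the
mock_kernel_register() helper are all hypothetical. Only the names
RSEQ_FLAG_TLS_SIZE, user_size and kernel_size come from the description
above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag value; the real bit is whatever the patch defines. */
#define RSEQ_FLAG_TLS_SIZE (1U << 3)

/* Hypothetical extended layout (field widths and offsets are assumptions). */
struct rseq_ext {
	uint32_t cpu_id_start;
	uint32_t cpu_id;
	uint64_t rseq_cs;
	uint32_t flags;
	uint16_t user_size;	/* set by the code defining __rseq_abi */
	uint16_t kernel_size;	/* set by the kernel at registration */
	uint64_t new_field;	/* hypothetical extension field */
} __attribute__((aligned(32)));

/* Definer side: advertise how much room is available. */
static void rseq_ext_init(struct rseq_ext *r)
{
	memset(r, 0, sizeof(*r));
	r->flags = RSEQ_FLAG_TLS_SIZE;
	r->user_size = (uint16_t)sizeof(*r);
}

/*
 * Hypothetical stand-in for the kernel: report how many bytes it
 * populates, never more than the room user-space advertised.
 */
static void mock_kernel_register(struct rseq_ext *r, uint16_t kernel_known)
{
	if (!(r->flags & RSEQ_FLAG_TLS_SIZE))
		return;		/* old layout: kernel_size stays 0 */
	r->kernel_size = kernel_known < r->user_size ?
			 kernel_known : r->user_size;
}

/* Critical-section side: is the field at [offset, offset + len) populated? */
static int rseq_ext_field_ok(const struct rseq_ext *r, size_t offset,
			     size_t len)
{
	return (r->flags & RSEQ_FLAG_TLS_SIZE) &&
	       offset + len <= r->kernel_size;
}
```

The point of the sketch is that the registering code never needs to know
sizeof(struct rseq_ext): the definer fills in user_size, and users of an
extension field check kernel_size before relying on it.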
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com