Message-ID: <CAPNVh5fiCCJpyeLj_ciWzFrO4fasVXZNhpfKXJhJWJirXdJOjQ@mail.gmail.com>
Date: Tue, 14 Jul 2020 11:33:44 -0700
From: Peter Oskolkov <posk@...gle.com>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc: Peter Oskolkov <posk@...k.io>,
Peter Zijlstra <peterz@...radead.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>,
paulmck <paulmck@...ux.ibm.com>,
Boqun Feng <boqun.feng@...il.com>,
"H. Peter Anvin" <hpa@...or.com>, Paul Turner <pjt@...gle.com>,
linux-api <linux-api@...r.kernel.org>,
Christian Brauner <christian.brauner@...ntu.com>,
Florian Weimer <fw@...eb.enyo.de>, carlos <carlos@...hat.com>,
Chris Kennelly <ckennelly@...gle.com>
Subject: Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
On Tue, Jul 14, 2020 at 10:43 AM Mathieu Desnoyers
<mathieu.desnoyers@...icios.com> wrote:
>
> ----- On Jul 14, 2020, at 1:24 PM, Peter Oskolkov posk@...k.io wrote:
>
> > At Google, we actually extended struct rseq (I will post the patches
> > here once they are fully deployed and we have specific
> > benefits/improvements to report). We did this by adding several fields
> > below __u32 flags (the last field currently), and correspondingly
> > increasing rseq_len in rseq() syscall. If the kernel does not know of
> > this extension, it will return -EINVAL due to an unexpected rseq_len;
> > then the application can either fall back to the standard/upstream
> > rseq, or bail. If the kernel does know of this extension, it accepts
> > it. If the application passes the old rseq_len (32), the kernel knows
> > that this is an old application and treats it as such.
> >
> > I looked through the archives, but I did not find specifically why the
> > pretty standard approach described above is considered inferior to the
> > one taken in this patch (freeze rseq_len at 32, add additional length
> > fields to struct rseq). Can these be summarized?
>
> I think you don't face the issues I'm facing with libc rseq integration
> because you control the entire user-space software ecosystem at Google.
>
> The main issue we face is that the library responsible for registering
> rseq (either glibc 2.32+, an early-adopter librseq library, or the
> application) may very well not be the same library defining the __rseq_abi
> symbol used in the global symbol table. Interposition with LD_PRELOAD or
> by defining __rseq_abi in the program's executable are good examples
> of this kind of scenario, and those use-cases are supported.
>
> So the size of the __rseq_abi structure may be larger than the struct
> rseq known by glibc (or possibly smaller, if a future glibc version
> extends its __rseq_abi size but is loaded with an older program or
> library doing __rseq_abi interposition).
>
> So we need some way to allow code defining the __rseq_abi to let the kernel
> know how much room is available, without necessarily requiring the code
> responsible for rseq registration to be aware of that extended layout.
> This is the purpose of the RSEQ_FLAG_TLS_SIZE flag in __rseq_abi.flags
> together with the __rseq_abi.user_size field.
>
> And we need some way to allow the kernel to let user-space rseq critical
> sections (user code) know how many of those fields are actually populated
> by the kernel. This is the purpose of the RSEQ_FLAG_TLS_SIZE flag in
> __rseq_abi.flags together with the __rseq_abi.kernel_size field.
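To make the two-way size exchange described in the quoted text concrete, here is a minimal user-space simulation in C. It is only a sketch under stated assumptions: the thread names flags, user_size, kernel_size and RSEQ_FLAG_TLS_SIZE, but the field widths, their placement after flags, the flag's bit value, the extension field and every sketch_* name below are hypothetical. No real syscall is made.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit value; the real RSEQ_FLAG_TLS_SIZE value is not given
 * in the thread. */
#define SKETCH_FLAG_TLS_SIZE (1u << 3)
/* Size of the original structure, as stated in the thread. */
#define SKETCH_ORIG_SIZE 32u

/* Hypothetical extended layout. Widths and ordering are assumptions. */
struct sketch_rseq {
    uint8_t  orig[20];     /* stand-in for the fields that precede flags */
    uint32_t flags;
    uint16_t user_size;    /* written by whoever defines the symbol */
    uint16_t kernel_size;  /* written by the kernel at registration */
    uint32_t pad;
    uint64_t ext_field;    /* hypothetical field beyond the first 32 bytes */
};

static_assert(offsetof(struct sketch_rseq, ext_field) == SKETCH_ORIG_SIZE,
              "the extension starts right after the original 32 bytes");

/* Simulates a kernel that understands `kernel_known` bytes of the area.
 * The registering library only passes the pointer along; it does not
 * need to know the extended layout. */
static void sketch_kernel_register(struct sketch_rseq *r, uint16_t kernel_known)
{
    if (!(r->flags & SKETCH_FLAG_TLS_SIZE))
        return; /* old-style area: only the original 32 bytes are used */
    r->kernel_size = r->user_size < kernel_known ? r->user_size : kernel_known;
}

/* What a critical section would check before relying on a field. */
static int sketch_field_populated(const struct sketch_rseq *r,
                                  size_t off, size_t len)
{
    size_t limit = (r->flags & SKETCH_FLAG_TLS_SIZE) ? r->kernel_size
                                                     : SKETCH_ORIG_SIZE;
    return off + len <= limit;
}
```

In this sketch, the code that defines the symbol sets SKETCH_FLAG_TLS_SIZE and user_size = sizeof(struct sketch_rseq). After sketch_kernel_register() runs, a critical section calls sketch_field_populated() with the offset and size of ext_field before reading it, so an older kernel simply leaves the extension unused.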
Thanks, Mathieu, for the explanation. Yes, multiple unrelated
libraries having to share struct rseq complicates matters. Your
approach appears to be a way to reconcile the issues you outlined
above.
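For contrast, the length-based scheme from my earlier message (extend the structure, pass a larger rseq_len, fall back on -EINVAL) can be simulated as below. This is a rough sketch, not our actual patches: the extended length of 64 and all sketch_* names are made up for illustration, and the "kernel" is a plain function rather than the rseq() syscall.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SKETCH_LEN_ORIG 32u  /* upstream rseq_len, as stated in the thread */
#define SKETCH_LEN_EXT  64u  /* hypothetical extended length */

/* Simulated kernel-side length check: an unknown rseq_len gets -EINVAL. */
static int sketch_kernel_rseq(uint32_t rseq_len, int kernel_has_ext)
{
    if (rseq_len == SKETCH_LEN_ORIG)
        return 0;                       /* old application, old layout */
    if (kernel_has_ext && rseq_len == SKETCH_LEN_EXT)
        return 0;                       /* extension known and accepted */
    return -EINVAL;
}

/* Application side: try the extended layout first, then fall back.
 * Returns the length that was accepted, or 0 if registration failed. */
static uint32_t sketch_register(int kernel_has_ext)
{
    if (sketch_kernel_rseq(SKETCH_LEN_EXT, kernel_has_ext) == 0)
        return SKETCH_LEN_EXT;
    if (sketch_kernel_rseq(SKETCH_LEN_ORIG, kernel_has_ext) == 0)
        return SKETCH_LEN_ORIG;
    return 0;
}
```

Note that sketch_register() has to know SKETCH_LEN_EXT itself. That is exactly the assumption that breaks when the registering library and the code defining __rseq_abi are different components.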
Thanks,
Peter
>
> Thanks,
>
> Mathieu
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com