Message-ID: <1339477886.25835.1643750440726.JavaMail.zimbra@efficios.com>
Date: Tue, 1 Feb 2022 16:20:40 -0500 (EST)
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Florian Weimer <fw@...eb.enyo.de>
Cc: Peter Zijlstra <peterz@...radead.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>,
paulmck <paulmck@...nel.org>, Boqun Feng <boqun.feng@...il.com>,
"H. Peter Anvin" <hpa@...or.com>, Paul Turner <pjt@...gle.com>,
linux-api <linux-api@...r.kernel.org>,
Christian Brauner <christian.brauner@...ntu.com>,
David Laight <David.Laight@...LAB.COM>,
carlos <carlos@...hat.com>, Peter Oskolkov <posk@...k.io>
Subject: Re: [RFC PATCH 2/3] rseq: extend struct rseq with per thread group
vcpu id
----- On Feb 1, 2022, at 3:32 PM, Florian Weimer fw@...eb.enyo.de wrote:
[...]
>
>>> Is the switch really useful? I suspect it's faster to just write as
>>> much as possible all the time. The switch should be well-predictable
>>> if running uniform userspace, but still …
>>
>> The switch ensures the kernel doesn't try to write to a memory area beyond
>> the rseq size which has been registered by user-space. So it seems to be
>> useful to ensure we don't corrupt user-space memory. Or am I missing your
>> point?
>
> Due to the alignment, I think you'd only ever see 32 and 64 bytes for
> now?
Yes, but I would expect the rseq registration arguments to have an rseq_len
of offsetofend(struct rseq, tg_vcpu_id) when userspace wants the tg_vcpu_id
feature to be supported (but not the features that follow it).
Then, as we append additional features as follow-up fields, glibc
eventually requests them by increasing the requested size.
It would then be rather odd to receive a registration size which is not
a multiple of 32 bytes, and yet rely on internal knowledge of the structure
alignment in the kernel code to write beyond the requested size. And all
this happens on the return to user-space after a preemption,
so I don't expect this extra switch/case to cause significant overhead.
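A rough user-space sketch of the idea, for illustration only: the kernel
touches a field only when the registered rseq_len fully covers it. This is
not the code from the patch. The struct layout (including node_id), the
helper name rseq_sketch_update() and the use of plain length checks instead
of a switch/case are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout for illustration; NOT the real uapi struct rseq. */
struct rseq_sketch {
	uint32_t cpu_id_start;
	uint32_t cpu_id;
	uint64_t rseq_cs;
	uint32_t flags;
	uint32_t node_id;	/* assumed extension field */
	uint32_t tg_vcpu_id;	/* assumed extension field */
} __attribute__((aligned(32)));

/* User-space equivalent of the kernel's offsetofend() helper. */
#define offsetofend(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/*
 * Write only the fields that lie entirely within the first rseq_len
 * bytes of the area. Returns the number of fields written.
 */
static int rseq_sketch_update(struct rseq_sketch *r, size_t rseq_len,
			      uint32_t cpu, uint32_t node, uint32_t vcpu)
{
	int written = 0;

	assert(r != NULL);
	if (rseq_len >= offsetofend(struct rseq_sketch, cpu_id)) {
		r->cpu_id_start = cpu;
		r->cpu_id = cpu;
		written += 2;
	}
	if (rseq_len >= offsetofend(struct rseq_sketch, node_id)) {
		r->node_id = node;
		written++;
	}
	if (rseq_len >= offsetofend(struct rseq_sketch, tg_vcpu_id)) {
		r->tg_vcpu_id = vcpu;
		written++;
	}
	return written;
}
```

With a registration length of offsetofend(struct rseq_sketch, node_id), the
tg_vcpu_id bytes are left untouched even though they sit inside the same
32-byte aligned block.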
>
> I'd appreciate it if you could put the maximum supported size and possibly
> the alignment in the auxiliary vector, so that we don't have to issue rseq
> system calls in a loop on process startup.
Yes, it's a good idea. I'm not too familiar with the auxiliary vector.
Are we talking about the kernel's fs/binfmt_elf.c:fill_auxv_note()?
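For the user-space side, a minimal sketch of how a libc could pick the
registration size at startup without probing the rseq system call in a
loop. It assumes the kernel exposes two auxiliary vector entries; the
AT_SKETCH_* names and values are placeholders invented for this example
and are not existing ABI. getauxval() returns 0 for entries that are not
present, in which case the sketch falls back to the original 32-byte,
32-byte-aligned rseq area.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/auxv.h>

/* Placeholder auxv types: assumptions for this sketch, not existing ABI. */
#define AT_SKETCH_RSEQ_MAX_SIZE	0x7ff0UL
#define AT_SKETCH_RSEQ_ALIGN	0x7ff1UL

struct rseq_area_params {
	size_t len;	/* size of the area to allocate and register */
	size_t align;	/* required alignment of the area */
};

/*
 * Combine the values read from the auxiliary vector with the original
 * rseq ABI defaults. A value of 0 means "entry not provided".
 */
static struct rseq_area_params rseq_sketch_params(unsigned long max_size,
						  unsigned long align)
{
	struct rseq_area_params p = { 32, 32 };	/* original ABI defaults */

	if (max_size > p.len)
		p.len = max_size;
	if (align > p.align)
		p.align = align;
	/* Sanity check for the sketch: alignment must be a power of two. */
	assert((p.align & (p.align - 1)) == 0);
	return p;
}

/* Query the (hypothetical) auxv entries once at process startup. */
static struct rseq_area_params rseq_sketch_probe(void)
{
	return rseq_sketch_params(getauxval(AT_SKETCH_RSEQ_MAX_SIZE),
				  getauxval(AT_SKETCH_RSEQ_ALIGN));
}
```

The result of rseq_sketch_probe() would then feed a single aligned
allocation followed by one rseq registration call.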
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com