Message-ID: <20200708162247.txdleelcalxkrfjy@wittgenstein>
Date: Wed, 8 Jul 2020 18:22:47 +0200
From: Christian Brauner <christian.brauner@...ntu.com>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc: Florian Weimer <fw@...eb.enyo.de>,
Linus Torvalds <torvalds@...ux-foundation.org>,
carlos <carlos@...hat.com>, Thomas Gleixner <tglx@...utronix.de>,
linux-kernel <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
paulmck <paulmck@...ux.ibm.com>,
Boqun Feng <boqun.feng@...il.com>,
"H. Peter Anvin" <hpa@...or.com>, Paul Turner <pjt@...gle.com>,
linux-api <linux-api@...r.kernel.org>,
Dmitry Vyukov <dvyukov@...gle.com>,
Neel Natu <neelnatu@...gle.com>
Subject: Re: [RFC PATCH for 5.8 3/4] rseq: Introduce RSEQ_FLAG_RELIABLE_CPU_ID
On Wed, Jul 08, 2020 at 11:33:51AM -0400, Mathieu Desnoyers wrote:
> [ Context for Linus: I am dropping this RFC patch, but am curious to
> hear your point of view on exposing to user-space which system call
> behavior fixes are present in the kernel, either through feature
> flags or system-call versioning. The intent is to allow user-space
> to make better decisions on whether it should use a system call or
> rely on fallback behavior. ]
>
> ----- On Jul 7, 2020, at 3:55 PM, Florian Weimer fw@...eb.enyo.de wrote:
>
> > * Carlos O'Donell:
> >
> >> It's not a great fit IMO. Just let the kernel version be the arbiter of
> >> correctness.
> >
> > For manual review, sure. But checking it programmatically does not
> > yield good results due to backports. Even those who use the stable
> > kernel series sometimes pick up critical fixes beforehand, so it's not
> > reliably possible for a program to say, “I do not want to run on this
> > kernel because it has a bad version”. We had a recent episode of this
> > with the Go runtime, which tried to do exactly this.
>
> FWIW, the kernel fix backport issue would also be a concern if we exposed
> a numeric "fix level version" with specific system calls: what should
> we do if a distribution chooses to include one fix in the sequence,
> but not others? Identifying fixes as "feature flags" allows
> cherry-picking specific fixes in a backport, but versions would not
> allow that.
>
> That being said, maybe it's not such a bad thing to _require_ the
> entire series of fixes to be picked in backports, which would be a
> fortunate side-effect of the per-syscall-fix-version approach.
>
> But I'm under the impression that such a scheme ends up versioning
> a system call, which I suspect will be a no-go from Linus' perspective.
I've been following this a little bit. The kernel version itself doesn't
really mean anything and is imho not at all interesting to userspace
applications, especially cross-distro programs. We can't go around and
ask Red Hat, SUSE, Ubuntu, Archlinux, openSUSE and god knows what other
distro what their fixed kernel version is. That's not feasible at all and
not how most programs do it. Sure, a lot of programs name a minimal kernel
version they require but realistically we can't keep bumping it all the
time. So the best strategy for userspace imho has been to introduce a
re-versioned flag or enum that indicates the fixed behavior.
So I would suggest just introducing

    RSEQ_FLAG_REGISTER_2 = (1 << 2),

That's how these things are usually done (Netlink etc.). So don't
introduce a fix bit or whatever, simply re-version your flag/enum.
We already deal with this today.
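As a rough sketch of what userspace does with such a flag: try the new
flag first and fall back when the kernel rejects it. Everything below is
a userspace stand-in, not the real syscall. mock_rseq() and
probe_fixed_rseq() are made-up names, RSEQ_FLAG_REGISTER_2 is only the
value suggested above and exists in no released kernel, and the sketch
assumes that unknown flag bits are rejected with EINVAL.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Existing flag in the rseq UAPI. */
#define RSEQ_FLAG_UNREGISTER (1U << 0)
/* Hypothetical re-versioned flag from this mail; not in any released kernel. */
#define RSEQ_FLAG_REGISTER_2 (1U << 2)

/*
 * Stand-in for the rseq() syscall (assumption: a kernel rejects any flag
 * bit it does not know with -EINVAL). known_flags models which bits the
 * running kernel understands.
 */
static int mock_rseq(uint32_t known_flags, uint32_t flags)
{
	if (flags & ~known_flags)
		return -EINVAL;
	return 0;
}

/*
 * Returns true if the "kernel" accepts the re-versioned flag, i.e. the
 * fixed behavior is present. On false the caller should use its fallback.
 */
static bool probe_fixed_rseq(uint32_t known_flags)
{
	return mock_rseq(known_flags, RSEQ_FLAG_REGISTER_2) == 0;
}
```

A program would run such a probe once at startup against the real
syscall and take its fallback path when the probe fails. A backport then
only needs to carry the flag along with the fix, and no kernel version
comparison is involved.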
(Also, as a side note: I see that you're passing struct rseq *rseq with
a length argument but you are not versioning by size. Is that
intentional? That somewhat locks you to the current struct rseq layout
and means users might run into problems when you extend struct rseq in
the future, as they can't pass the new struct down to older kernels. The
way we deal with this now - rseq might predate this - is
copy_struct_from_user() (for example in sched_{get,set}attr(),
openat2(), bpf(), clone3(), etc.). Maybe you want to switch to that to
keep rseq extensible? Users can detect the new rseq version by just
passing a larger struct down to the kernel with the extra bytes set to 0
and if rseq doesn't complain they know they're dealing with an rseq that
knows larger struct sizes. Might be worth it if you have any reason to
believe that struct rseq might need to grow.)
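For reference, the size-versioning rules can be modelled in plain
userspace C roughly like this. It is only a sketch of the semantics as I
understand them, not the kernel implementation: model_copy_struct() and
the two demo structs are made up for illustration and are not the real
struct rseq layout, and memcpy() stands in for the actual user copy, so
there is no -EFAULT path.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layouts for illustration only; not the real struct rseq. */
struct demo_v1 { uint32_t cpu_id; uint32_t flags; };
struct demo_v2 { uint32_t cpu_id; uint32_t flags; uint64_t extra; };

/*
 * Userspace model of the size-versioned copy:
 *  - usize < ksize: old caller, new "kernel"; the unknown tail is zeroed.
 *  - usize > ksize: new caller, old "kernel"; accepted only if the bytes
 *    the kernel does not know about are all zero, otherwise -E2BIG.
 */
static int model_copy_struct(void *dst, size_t ksize,
			     const void *src, size_t usize)
{
	size_t size = ksize < usize ? ksize : usize;

	if (usize < ksize) {
		memset((unsigned char *)dst + size, 0, ksize - usize);
	} else if (usize > ksize) {
		const unsigned char *tail = (const unsigned char *)src + size;

		for (size_t i = 0; i < usize - ksize; i++)
			if (tail[i] != 0)
				return -E2BIG;
	}
	memcpy(dst, src, size);
	return 0;
}
```

In this model an older kernel still accepts the larger struct as long as
the new fields are zero, and returns -E2BIG only once the caller
actually puts data into fields the kernel does not know.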
Christian