Message-ID: <CAHk-=wh82uJ5Poqby3brn-D7xWbCMnGv-JnwfO0tuRfCvsVgXA@mail.gmail.com>
Date: Mon, 21 Jun 2021 16:14:36 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Al Viro <viro@...iv.linux.org.uk>
Cc: "Eric W. Biederman" <ebiederm@...ssion.com>,
Michael Schmitz <schmitzmic@...il.com>,
linux-arch <linux-arch@...r.kernel.org>,
Jens Axboe <axboe@...nel.dk>, Oleg Nesterov <oleg@...hat.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Richard Henderson <rth@...ddle.net>,
Ivan Kokshaysky <ink@...assic.park.msu.ru>,
Matt Turner <mattst88@...il.com>,
alpha <linux-alpha@...r.kernel.org>,
Geert Uytterhoeven <geert@...ux-m68k.org>,
linux-m68k <linux-m68k@...ts.linux-m68k.org>,
Arnd Bergmann <arnd@...nel.org>,
Ley Foon Tan <ley.foon.tan@...el.com>,
Tejun Heo <tj@...nel.org>, Kees Cook <keescook@...omium.org>,
Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
Subject: Re: Kernel stack read with PTRACE_EVENT_EXIT and io_uring threads

On Mon, Jun 21, 2021 at 12:45 PM Al Viro <viro@...iv.linux.org.uk> wrote:
> >
> > Looks like sys_exit() and do_group_exit() would be the two places to
> > do it (do_group_exit() would handle the signal case and
> > sys_exit_group()).
>
> Maybe... I'm digging through that pile right now, will follow up when
> I get a reasonably complete picture

We might have another possible way to solve this:

 (a) make it the rule that everybody always saves the full (integer)
     register set in pt_regs

 (b) make m68k just always create that switch-stack for all system
     calls (it's really not that big, I think it's like six words or
     something)

 (c) admit that alpha is broken, but nobody really cares

> In the meanwhile, do kernel/kthread.c uses look even remotely sane?
> Intentional - sure, but it really looks wrong to use thread exit code
> as communication channel there...

I really doubt that it is even "intentional".

I think it's "use some errno as a random exit code" and nobody ever
really thought about it, or thought about how that doesn't really
work. People are used to the error numbers, not thinking about how
do_exit() doesn't take an error number, but a signal number (and an
8-bit positive error code in bits 8-15).

Because no, it's not even remotely sane.

I think the do_exit(-EINTR) could be do_exit(SIGINT) and it would make
more sense. And the -ENOMEM might be SIGBUS, perhaps.
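
To make that encoding concrete, here is a small userspace sketch (not
kernel code) of the status layout described above: signal number in the
low seven bits, 8-bit exit code in bits 8-15. The helper names
pack_status(), exited_with() and killed_by() are made up for
illustration; only the W* macros from <sys/wait.h> are real.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/wait.h>

/*
 * Hypothetical helper (not a kernel function): build a status word with
 * the layout described above, i.e. signal number in the low 7 bits and
 * an 8-bit exit code in bits 8-15.
 */
static int pack_status(int exit_code, int signo)
{
	assert(exit_code >= 0 && exit_code <= 0xff);
	assert(signo >= 0 && signo <= 0x7f);
	return (exit_code << 8) | signo;
}

/* Nonzero if the status word reads as a normal exit with this code. */
static int exited_with(int status, int exit_code)
{
	return WIFEXITED(status) && WEXITSTATUS(status) == exit_code;
}

/* Nonzero if the status word reads as death by this signal. */
static int killed_by(int status, int signo)
{
	return WIFSIGNALED(status) && WTERMSIG(status) == signo;
}
```

With these, pack_status(0, SIGINT) decodes as "killed by SIGINT", which
is roughly what do_exit(SIGINT) would report. A raw -EINTR does not
decode as a normal exit at all: on Linux, where EINTR is 4, its low
seven bits are 0x7c, so WIFEXITED() is false and no caller ever gets to
see "EINTR" as an exit code.
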

It does look like the usermode-helper code does save the exit code
with things like

        kernel_wait(pid, &sub_info->retval);

and I see call_usermodehelper_exec() doing

        retval = sub_info->retval;

and treating it as an error code. But I think those have never been
tested with that (bogus) exit code thing from kernel_wait(), because
it wouldn't have worked. It has only ever been tested with the (real)
error code things like

        if (pid < 0) {
                sub_info->retval = pid;

which does actually assign a negative error code to it.

So I think that

        kernel_wait(pid, &sub_info->retval);

line is buggy, and should be something like

        int wstatus;

        kernel_wait(pid, &wstatus);
        sub_info->retval = WEXITSTATUS(wstatus) ? -EINVAL : 0;

or something.
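
As a userspace sketch of that conversion (again not kernel code, and
wstatus_to_retval() is a hypothetical name), the mapping from a wait
status to the "0 or negative errno" convention could look like the
following. Treating death by signal as a failure too is an extra
assumption here; the one-liner above only looks at WEXITSTATUS().

```c
#include <errno.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: turn a wait status into "0 or negative errno".
 * Assumption: anything other than a clean exit with code 0 counts as
 * a failure, including termination by a signal.
 */
static int wstatus_to_retval(int wstatus)
{
	if (!WIFEXITED(wstatus))
		return -EINVAL;
	return WEXITSTATUS(wstatus) ? -EINVAL : 0;
}
```

A status of 0 maps to 0, an exit code of 1 (status 1 << 8) maps to
-EINVAL, and a status with a signal number in the low bits also maps to
-EINVAL. Whether -EINVAL is the right errno is left open, as above.
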
Linus