Message-ID: <87eed4v2dc.fsf@disp2133>
Date: Mon, 14 Jun 2021 11:26:39 -0500
From: ebiederm@...ssion.com (Eric W. Biederman)
To: Michael Schmitz <schmitzmic@...il.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
linux-arch <linux-arch@...r.kernel.org>,
Jens Axboe <axboe@...nel.dk>, Oleg Nesterov <oleg@...hat.com>,
Al Viro <viro@...iv.linux.org.uk>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Richard Henderson <rth@...ddle.net>,
Ivan Kokshaysky <ink@...assic.park.msu.ru>,
Matt Turner <mattst88@...il.com>,
alpha <linux-alpha@...r.kernel.org>,
Geert Uytterhoeven <geert@...ux-m68k.org>,
linux-m68k <linux-m68k@...ts.linux-m68k.org>,
Arnd Bergmann <arnd@...nel.org>,
Ley Foon Tan <ley.foon.tan@...el.com>,
Tejun Heo <tj@...nel.org>, Kees Cook <keescook@...omium.org>
Subject: Re: Kernel stack read with PTRACE_EVENT_EXIT and io_uring threads
Michael Schmitz <schmitzmic@...il.com> writes:
> On second thought, I'm not certain what adding another empty stack frame would
> achieve here.
>
> On m68k, 'frame' already is a new stack frame, for running the new thread
> in. This new frame does not have any user context at all, and it's explicitly
> wiped anyway.
>
> Unless we save all user context on the stack, then push that context to a new
> save frame, and somehow point get_signal to look there for IO threads
> (essentially what Eric suggested), I don't see how this could work?
>
> I must be missing something.
It is only designed to work well enough so that ptrace will access
something well defined when ptrace accesses io_uring tasks.

The io_uring tasks are special in that they are user process
threads that never run in userspace. So as long as everything
ptrace can read is accessible on that process all is well.
Having stared a bit longer at the code I think the short term
fix for both PTRACE_EVENT_EXIT and io_uring is to guard
them both with CONFIG_HAVE_ARCH_TRACEHOOK.

Today CONFIG_HAVE_ARCH_TRACEHOOK guards access to /proc/self/syscall,
which out of necessity ensures that user context is always readable.
That seems to solve both the PTRACE_EVENT_EXIT and the io_uring
problems.
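
For reference, a minimal user-space sketch of what that guarded file
exposes. This is an illustration only, not code from the kernel tree.
The helper names read_self_syscall and parse_syscall_line are
hypothetical, and the field layout is assumed from proc(5): the syscall
number, six argument registers, then the stack pointer and program
counter, or "-1 sp pc" when blocked outside a syscall, or just
"running". On a kernel built without CONFIG_HAVE_ARCH_TRACEHOOK the file
is absent, which the sketch reports as None.

```python
def read_self_syscall(path="/proc/self/syscall"):
    """Return the raw contents of the syscall file, or None if unavailable.

    The file is only present when the kernel was built with
    CONFIG_HAVE_ARCH_TRACEHOOK, so a missing file is treated as
    "not supported" rather than as an error.
    """
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


def parse_syscall_line(line):
    """Parse one line in the layout assumed from proc(5)."""
    fields = line.split()
    if not fields:
        raise ValueError("empty syscall line")
    if fields == ["running"]:
        # Task is not blocked, so no registers are reported.
        return {"state": "running"}
    nr = int(fields[0])
    if nr == -1:
        # Blocked, but not inside a system call: "-1 sp pc".
        return {
            "state": "blocked",
            "sp": int(fields[1], 16),
            "pc": int(fields[2], 16),
        }
    # Inside a system call: "nr arg0 ... arg5 sp pc".
    return {
        "state": "syscall",
        "nr": nr,
        "args": [int(a, 16) for a in fields[1:7]],
        "sp": int(fields[7], 16),
        "pc": int(fields[8], 16),
    }


if __name__ == "__main__":
    raw = read_self_syscall()
    if raw is None:
        print("no /proc/self/syscall (kernel may lack HAVE_ARCH_TRACEHOOK)")
    else:
        print(parse_syscall_line(raw))
```

The stack pointer and program counter in that output are user context,
which is why serving this file requires the architecture to keep user
registers readable for the task.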
What I especially like about that is there are a lot of other reasons
to encourage architectures in a CONFIG_HAVE_ARCH_TRACEHOOK direction.
I think the biggies are getting architectures to store the extra
saved state on context switch into some place in task_struct
and to implement the regset view of registers.
Hmm. This is odd. CONFIG_HAVE_ARCH_TRACEHOOK is supposed to imply
CORE_DUMP_USE_REGSET. But alpha, csky, h8300, m68k, microblaze, and nds32
don't implement CORE_DUMP_USE_REGSET, yet nds32 implements
CONFIG_HAVE_ARCH_TRACEHOOK.
I will keep digging and see what clean code I can come up with.
Eric