[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wjE_YJ4SFJUasF=tW0jMr6zg+rkRNuNt2hdODPu_LLTVw@mail.gmail.com>
Date: Wed, 29 Apr 2020 20:25:45 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Jann Horn <jannh@...gle.com>
Cc: Bernd Edlinger <bernd.edlinger@...mail.de>,
Oleg Nesterov <oleg@...hat.com>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
Waiman Long <longman@...hat.com>,
Ingo Molnar <mingo@...nel.org>, Will Deacon <will@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Alexey Gladkov <gladkov.alexey@...il.com>
Subject: Re: [GIT PULL] Please pull proc and exec work for 5.7-rc1
On Wed, Apr 29, 2020 at 8:00 PM Jann Horn <jannh@...gle.com> wrote:
>
> But if we go with Bernd's approach together with your restart
> suggestion,
So repeat after me: Bernd's approach _without_ the restart is unacceptable.
It's unacceptable because it breaks things that currently work, and
returns EAGAIN in situations where that is simple not a valid error
code.
His original patch also happens to be unacceptable because it's an
unmaintainable mess, but that's independent of the bug it introduced.
That bug has nothing to do with ptrace(). It's literally a "write()"
to a file in /proc.
What is so hard to get about this basic thing?
> then simply doing PTRACE_ATTACH on two threads A and B
> would be enough to livelock, right?
The same case that just causes a recursive wait. Yes. No worse off than we were.
And the fact is, *THAT* case looks truly trivial to work around.
Just make the ptrace() code - but not the fs/proc/base.c code - do
something like
if (lock_exec_creds(tsk))
return -EINTR;
and now ptrace() doesn't repeat (simply because it doesn't return that
ERESTARTNOINTR. It would go through that "return through signal
handling code" in the kernel, but it wouldn't actually retry the
system call).
But I'm getting less and less interested in trying to "fix" this
problem, when people don't seem to realize that the important case is
to not break _good_ programs, and the pointless buggy garbage
test-case is entirely uninteresting. It's buggy user code. If it
causes a wait or a livelock, nobody sane should care in the least. Fix
the bug in user space.
Introducing new bugs in the kernel where they didn't exist before - in
order to try to work around buggy user-space that has never ever
worked - is not acceptable.
Linus
Powered by blists - more mailing lists