[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wih4BqW7GTLaYxewynuT-iFHrXroip0wNo0CyPtmYGUow@mail.gmail.com>
Date: Wed, 8 Apr 2020 09:34:47 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: "Eric W. Biederman" <ebiederm@...ssion.com>
Cc: Waiman Long <longman@...hat.com>, Ingo Molnar <mingo@...nel.org>,
Will Deacon <will@...nel.org>,
Bernd Edlinger <bernd.edlinger@...mail.de>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Alexey Gladkov <gladkov.alexey@...il.com>
Subject: Re: [GIT PULL] Please pull proc and exec work for 5.7-rc1
On Wed, Apr 8, 2020 at 8:17 AM Eric W. Biederman <ebiederm@...ssion.com> wrote:
>
> Yes. I missed the fact that we could take the lock killable.
> We still unfortunately have the deadlock with ptrace.
That, I feel, is similarly trivial.
Again, anybody who takes the lock for writing should just do so
killably. So you have three cases:
- ptrace wins the race and gets the lock.
Fine, the execve will wait until afterwards.
- ptrace loses the race and is not a thread with execve.
Fine, the execve() won, and the ptrace will wait until after execve.
- ptrace loses the race and is a thread with execve.
Fine, the execve() will kill the thing in dethread() and the ptrace
thread will release the lock and die.
So all three cases are fine, and none of them have any behavioral
differences (as mentioned, the killing is "invisible" to users since
it's fundamentally a race, and you can consider the kill to have
happened before the ptrace started).
> It might be simpler to make whichever lock we are dealing with per
> task_struct instead of per signal_struct. Then we don't even have to
> think about what de_thread does or if the lock is taken killable.
Well, yes, but I think the dethread behavior of killing threads is
required anyway, so..
> I keep wondering if we could do something similar to vfork. That is
> allocate an new task_struct and fully set it up for the post exec
> process, and then make it visible under tasklist_lock. Finally we could
> free the old process.
>
> That would appear as if everything happened atomically from
> the point of view of the rest of the kernel.
I do think that would have been a lovely design originally, and would
avoid a lot of things. So "execve()" would basically look like an exit
and a thread creation with the same pid (without the SIGCHILD to the
parent, obviously)
That would also naturally handle the "flush pending signals" etc issues.
The fact that we created a whole new mm-struct ended up fixing a lot
of problems (even if it was painful to do). This might be similar.
But it's not what we've ever done, and I do suspect you'd run into a
lot of odd small special cases if we were to try to do it now.
So I think it's simpler to just start making the "cred lock waiters
have to be killable" rule. It's not like that's a very complex rule.
Linus
Powered by blists - more mailing lists