[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87v9maxb5q.fsf@x220.int.ebiederm.org>
Date: Wed, 08 Apr 2020 10:14:09 -0500
From: ebiederm@...ssion.com (Eric W. Biederman)
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Waiman Long <longman@...hat.com>, Ingo Molnar <mingo@...nel.org>,
Will Deacon <will@...nel.org>,
Bernd Edlinger <bernd.edlinger@...mail.de>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Alexey Gladkov <gladkov.alexey@...il.com>
Subject: Re: [GIT PULL] Please pull proc and exec work for 5.7-rc1
Linus Torvalds <torvalds@...ux-foundation.org> writes:
> On Mon, Apr 6, 2020 at 3:20 PM Eric W. Biederman <ebiederm@...ssion.com> wrote:
>>
>> But fundamentally the only reason we need this information stable
>> before the point of no return is so that we can return a nice error
>> code to the process calling exec. Instead of terminating the
>> process with SIGSEGV.
>
> I'd suggest doing it the other way around instead: let the thread that
> does the security_setprocattr() die, since execve() is terminating
> other threads anyway.
>
> And the easy way to do that is to just make the rule be that anybody
> who waits for this thing for write needs to use a killable wait.
>
> So if the execve() got started earlier, and already took the cred lock
> (whatever we'll call it) for reading, then zap_other_threads() will
> take care of another thread doing setprocattr().
>
> That sounds like a really simple model, no?
Yes. I missed the fact that we could take the lock killable.
We still unfortunately have the deadlock with ptrace.
It might be simpler to make whichever lock we are dealing with per
task_struct instead of per signal_struct. Then we don't even have to
think about what de_thread does or if the lock is taken killable.
Looking at the code in binfmt_elf.c there are about 11 other places
after install_exec_creds where we can fail and would be forced to
terminate the application with SIGSEGV instead of causing fork to fail.
I keep wondering if we could do something similar to vfork. That is
allocate an new task_struct and fully set it up for the post exec
process, and then make it visible under tasklist_lock. Finally we could
free the old process.
That would appear as if everything happened atomically from
the point of view of the rest of the kernel.
As well as fixing all of the deadlocks and making it easy
to ensure we don't have any more weird failures in the future.
Eric
p.s. For tasklist_lock I suspect we can put a lock in struct pid
and use that to guard the task lists in struct pid. Which would
allow for tasklist_lock to be take much less. Then we would
just need a solution for task->parent and task->real_parent and
I think all of the major users of tasklist_lock would be gone.
Powered by blists - more mailing lists