linux-kernel - Re: [GIT PULL] Please pull proc and exec work for 5.7-rc1

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAHk-=wjcyDkW_eQ_f4j82dci6jteSgOZ_ONRh9nzs9ykWELQng@mail.gmail.com>
Date:   Thu, 30 Apr 2020 06:37:51 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Jann Horn <jannh@...gle.com>
Cc:     Bernd Edlinger <bernd.edlinger@...mail.de>,
        Oleg Nesterov <oleg@...hat.com>,
        "Eric W. Biederman" <ebiederm@...ssion.com>,
        Waiman Long <longman@...hat.com>,
        Ingo Molnar <mingo@...nel.org>, Will Deacon <will@...nel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Alexey Gladkov <gladkov.alexey@...il.com>
Subject: Re: [GIT PULL] Please pull proc and exec work for 5.7-rc1

On Wed, Apr 29, 2020 at 8:25 PM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> Bernd's approach _without_ the restart is unacceptable.
>
> It's unacceptable because it breaks things that currently work, and
> returns EAGAIN in situations where that is simple not a valid error
> code.

Looking at my restart thing, I think it's a hack, and I don't think
that's acceptable either.

I was pleased with how clever it was, but it's one of those "clever
hacks" that is in the end more "hack" than "clever".

The basic issue is that releasing a lock in the middle just
fundamentally defeats the purpose of the lock unless you have a way to
redo the operation after fixing whatever caused the drop.

And the system call restart thing is dodgy, because there's none of
that "fixing".

It can cause that "write()" call to do the CPU busy loop too if it
hits that "execve() in process" situation.

The only difference with the "write()" case vs "ptrace()" is that
nobody has ever written an insane test-case that doesn't wait for
children, and then does a "write()" to the /proc file that can then
require zombie children to be reaped.

So I don't think the approach is valid even with the restart. Not
restarting isn't acceptable for write(), but restarting doesn't really
work either.

I guess we could have a very special lock that does something like

    int lock_exec_cred_mutex(struct task_struct *task)
    {
        if (mutex_trylock(&task->signal->cred_guard_mutex))
                return 0;

        if (lock_can_deadlock(task))
                return -EDEADLK;

        return mutex_lock_interruptible(&task->signal->cred_guard_mutex);
    }

might work. But that "lock_can_deadlock()" needs some kind of oracle
or heuristic.

And I can't come up with a perfect one, although I can come up with
things like "if the target has threads, and those threads have a
reaoer that is you, then you have to have SIGCHLD enabled". But it
gets ugly and hacky.

But I think actually releasing the lock in the middle of execve()
before it's done with is worse than ugly and hacky - it's
fundamentally broken.

Moving things around? Sure - like waiting for the threads _after_ the
lock and having done all the cred calculations. So I think Oleg's
patch works.

                 Linus