linux-kernel - Re: [PATCH 0/2] Infrastructure to allow fixing exec deadlocks

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87zhcuax8b.fsf@x220.int.ebiederm.org>
Date:   Thu, 05 Mar 2020 23:06:12 -0600
From:   ebiederm@...ssion.com (Eric W. Biederman)
To:     Bernd Edlinger <bernd.edlinger@...mail.de>
Cc:     Christian Brauner <christian.brauner@...ntu.com>,
        Kees Cook <keescook@...omium.org>,
        Jann Horn <jannh@...gle.com>, Jonathan Corbet <corbet@....net>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Alexey Dobriyan <adobriyan@...il.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Oleg Nesterov <oleg@...hat.com>,
        Frederic Weisbecker <frederic@...nel.org>,
        Andrei Vagin <avagin@...il.com>,
        Ingo Molnar <mingo@...nel.org>,
        "Peter Zijlstra \(Intel\)" <peterz@...radead.org>,
        Yuyang Du <duyuyang@...il.com>,
        David Hildenbrand <david@...hat.com>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Anshuman Khandual <anshuman.khandual@....com>,
        David Howells <dhowells@...hat.com>,
        James Morris <jamorris@...ux.microsoft.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Shakeel Butt <shakeelb@...gle.com>,
        Jason Gunthorpe <jgg@...pe.ca>,
        Christian Kellner <christian@...lner.me>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Aleksa Sarai <cyphar@...har.com>,
        "Dmitry V. Levin" <ldv@...linux.org>,
        "linux-doc\@vger.kernel.org" <linux-doc@...r.kernel.org>,
        "linux-kernel\@vger.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-fsdevel\@vger.kernel.org" <linux-fsdevel@...r.kernel.org>,
        "linux-mm\@kvack.org" <linux-mm@...ck.org>,
        "stable\@vger.kernel.org" <stable@...r.kernel.org>,
        "linux-api\@vger.kernel.org" <linux-api@...r.kernel.org>
Subject: Re: [PATCH 0/2] Infrastructure to allow fixing exec deadlocks

Bernd Edlinger <bernd.edlinger@...mail.de> writes:

> On 3/5/20 10:14 PM, Eric W. Biederman wrote:
>> 
>> Bernd, everyone
>> 
>> This is how I think the infrastructure change should look that makes way
>> for fixing this issue.
>> 
>> - Correct the point of no return.
>> - Add a new mutex to replace cred_guard_mutex
>> 
>> Then I think it is just going through the existing
>> users of cred_guard_mutex and fixing them to use the new one.
>> 
>> There really aren't that many users of cred_guard_mutex so we should be
>> able to get through the easy ones fairly quickly.  And anything that
>> isn't easy we can wait until we have a good fix.
>> 
>> The users of cred_guard_mutex that I saw were:
>>     fs/proc/base.c:
>>        proc_pid_attr_write
>>        do_io_accounting
>>        proc_pid_stack
>>        proc_pid_syscall
>>        proc_pid_personality
>>     
>>     perf_event_open
>>     mm_access
>>     kcmp
>>     pidfd_fget
>>     seccomp_set_mode_filter
>> 
>> Bernd does this make sense to you?  
>> 
>> I think we can fix the seccomp/no_new_privs issue with some careful
>> refactoring.  We can probably do the same for ptrace but that appears
>> to need a little lsm bug fixing.
>> 
>
> Yes, for most functions the proposed "exec_update_mutex" is fine,
> but we will need a longer-time block for ptrace_attach, seccomp_set_mode_filter
> and proc_pid_attr_write need to be blocked for the whole exec duration so
> they need a second "mutex", with deadlock-detection as in my previous patch,
> if I see that right.

So far I am leaving "cred_guard_mutex" as that second "mutex".  My sense
is that when all we have left are the hard cases we can take those
cases out in detail, examine them and see what really can be done.

> Unfortunately only one of the two test cases can be fixed without the
> second mutex, of course the mm_access is what cause the practical problem.

Fixing the practical problems are foremost on my agenda.
That and clearing away enough of the noise that we can really focus on
the hard problems when we begin to address them.

That way I am hoping we can really solve some of these issues and make
them go away.

> Currently for the unlimited user space delay, I have only the case of
> a ptraced sibling thread on my radar, de_thread waits for the parent
> to call wait in this case, that can literally take forever.
> But I know that also PTRACE_CONT may be needed after a PTRACE_EVENT_EXIT.
>
> Can you explain what else in the user space can go wrong to make an
> unlimited delay in the execve?

Triggering a page fault.  Depending on the backing store or possibly
with the use of userfaultfd that page fault can be delayed indefinitely
and pretty much be as bad as the ptrace case.

Eric