linux-kernel - Re: [PATCH] exec: Fix a deadlock in ptrace

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200301182014.emo34zwv5vjaydft@wittgenstein>
Date:   Sun, 1 Mar 2020 19:20:14 +0100
From:   Christian Brauner <christian.brauner@...ntu.com>
To:     Bernd Edlinger <bernd.edlinger@...mail.de>
Cc:     Aleksa Sarai <cyphar@...har.com>, Oleg Nesterov <oleg@...hat.com>,
        Jonathan Corbet <corbet@....net>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Alexey Dobriyan <adobriyan@...il.com>,
        "Eric W. Biederman" <ebiederm@...ssion.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Frederic Weisbecker <frederic@...nel.org>,
        Andrei Vagin <avagin@...il.com>,
        Ingo Molnar <mingo@...nel.org>,
        "Peter Zijlstra (Intel)" <peterz@...radead.org>,
        Yuyang Du <duyuyang@...il.com>,
        David Hildenbrand <david@...hat.com>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Anshuman Khandual <anshuman.khandual@....com>,
        David Howells <dhowells@...hat.com>,
        Jann Horn <jannh@...gle.com>,
        James Morris <jamorris@...ux.microsoft.com>,
        Kees Cook <keescook@...omium.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Shakeel Butt <shakeelb@...gle.com>,
        Jason Gunthorpe <jgg@...pe.ca>,
        Christian Kellner <christian@...lner.me>,
        Andrea Arcangeli <aarcange@...hat.com>,
        "Dmitry V. Levin" <ldv@...linux.org>,
        "linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>
Subject: Re: [PATCH] exec: Fix a deadlock in ptrace

On Sun, Mar 01, 2020 at 05:46:08PM +0000, Bernd Edlinger wrote:
> On 3/1/20 4:58 PM, Christian Brauner wrote:
> > On Mon, Mar 02, 2020 at 02:13:33AM +1100, Aleksa Sarai wrote:
> >> On 2020-03-01, Bernd Edlinger <bernd.edlinger@...mail.de> wrote:
> >>> This fixes a deadlock in the tracer when tracing a multi-threaded
> >>> application that calls execve while more than one thread are running.
> >>>
> >>> I observed that when running strace on the gcc test suite, it always
> >>> blocks after a while, when expect calls execve, because other threads
> >>> have to be terminated.  They send ptrace events, but the strace is no
> >>> longer able to respond, since it is blocked in vm_access.
> >>>
> >>> The deadlock is always happening when strace needs to access the
> >>> tracees process mmap, while another thread in the tracee starts to
> >>> execve a child process, but that cannot continue until the
> >>> PTRACE_EVENT_EXIT is handled and the WIFEXITED event is received:
> >>>
> >>> strace          D    0 30614  30584 0x00000000
> >>> Call Trace:
> >>> __schedule+0x3ce/0x6e0
> >>> schedule+0x5c/0xd0
> >>> schedule_preempt_disabled+0x15/0x20
> >>> __mutex_lock.isra.13+0x1ec/0x520
> >>> __mutex_lock_killable_slowpath+0x13/0x20
> >>> mutex_lock_killable+0x28/0x30
> >>> mm_access+0x27/0xa0
> >>> process_vm_rw_core.isra.3+0xff/0x550
> >>> process_vm_rw+0xdd/0xf0
> >>> __x64_sys_process_vm_readv+0x31/0x40
> >>> do_syscall_64+0x64/0x220
> >>> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >>>
> >>> expect          D    0 31933  30876 0x80004003
> >>> Call Trace:
> >>> __schedule+0x3ce/0x6e0
> >>> schedule+0x5c/0xd0
> >>> flush_old_exec+0xc4/0x770
> >>> load_elf_binary+0x35a/0x16c0
> >>> search_binary_handler+0x97/0x1d0
> >>> __do_execve_file.isra.40+0x5d4/0x8a0
> >>> __x64_sys_execve+0x49/0x60
> >>> do_syscall_64+0x64/0x220
> >>> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >>>
> >>> The proposed solution is to have a second mutex that is
> >>> used in mm_access, so it is allowed to continue while the
> >>> dying threads are not yet terminated.
> >>>
> >>> I also took the opportunity to improve the documentation
> >>> of prepare_creds, which is obviously out of sync.
> >>>
> >>> Signed-off-by: Bernd Edlinger <bernd.edlinger@...mail.de>
> >>
> >> I can't comment on the validity of the patch, but I also found and
> >> reported this issue in 2016[1] and the discussion quickly veered into
> >> the problem being more complicated (and uglier) than it seems at first
> >> glance.
> >>
> >> You should probably also Cc stable, given this has been a long-standing
> >> issue and your patch doesn't look (too) invasive.
> >>
> >> [1]: https://lore.kernel.org/lkml/20160921152946.GA24210@dhcp22.suse.cz/
> > 
> > Yeah, I remember you mentioning this a while back.
> > 
> > Bernd, we really want a reproducer for this sent alongside with this
> > patch added to:
> > tools/testing/selftests/ptrace/
> > Having a test for this bug irrespective of whether or not we go with
> > this as fix seems really worth it.
> > 
> 
> I ran into this issue, because I wanted to fix an issue in the gcc testsuite,
> namely why it forgets to remove some temp files,
> so I did the following:
> 
> strace -ftt -o trace.txt make check-gcc-c -k -j4
> 
> I reproduced with v4.20 and v5.5 kernel, and I don't know why but it is
> not happening on all systems I tested, maybe it is something that the expect program
> does, because, always when I try to reproduce this, the deadlock was always in "expect".
> 
> I use expect version 5.45 on the computer where the above test freezes after
> a couple of minutes.
> 
> I think the issue with strace is that it is using vm_access to get the parameters
> of a syscall that is going on in one thread, and that races with another thread
> that calls execve, and blocks the cred_guard_mutex.
> 
> While Olg's test case here, will certainly not be fixed:
> 
> https://lore.kernel.org/lkml/20160923095031.GA14923@redhat.com/
> 
> he mentions the access to "anything else which needs ->cred_guard_mutex,
> say open(/proc/$pid/mem)", I don't know for sure how that can be done, but if
> that is possible, it would probably work as a test case.
> 
> What do you think?

Yeah, anything that calls ptrace_may_access() is fine and
open(/proc/$pid/mem) will work so long as $pid is not in the same
thread-group as the caller. A polished version of the reproducer you
linked in would probably be good.