linux-kernel - Re: fs: uninterruptible hang in handle


Message-ID: <CA+55aFyqy2JYoVe_hQhE7gBRZqE7C9qn_ZYJ_B1g9sTQQF6OxQ@mail.gmail.com>
Date:	Tue, 1 Mar 2016 11:56:22 -0800
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Dmitry Vyukov <dvyukov@...gle.com>
Cc:	Alexander Viro <viro@...iv.linux.org.uk>,
	"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Pavel Emelyanov <xemul@...allels.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	syzkaller <syzkaller@...glegroups.com>,
	Kostya Serebryany <kcc@...gle.com>,
	Alexander Potapenko <glider@...gle.com>,
	Sasha Levin <sasha.levin@...cle.com>
Subject: Re: fs: uninterruptible hang in handle_userfault

On Tue, Mar 1, 2016 at 3:29 AM, Dmitry Vyukov <dvyukov@...gle.com> wrote:
>
> The following program creates an unkillable process in D state:

It seems to be userfaultfd that *tries* to handle signals, but there's
one special fault case where signals won't make it through: when we're
exiting and doing the final child pid clearing access.

We could do this two ways:

(a) special-case the PF_EXITING case for userfaultfd, something like

    diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
    index 50311703135b..66cdb44616d5 100644
    --- a/fs/userfaultfd.c
    +++ b/fs/userfaultfd.c
    @@ -287,6 +287,12 @@ int handle_userfault(struct vm_area_struct *vma, unsigned long address,
                    goto out;

            /*
    +        * We don't do userfault handling for the final child pid update.
    +        */
    +       if (current->flags & PF_EXITING)
    +               goto out;
    +
    +       /*
             * Check that we can return VM_FAULT_RETRY.
             *
             * NOTE: it should become possible to return VM_FAULT_RETRY

or (b) always consider the exiting case to be "fatal signal pending"

    diff --git a/include/linux/sched.h b/include/linux/sched.h
    index a10494a94cc3..5adf9f001df3 100644
    --- a/include/linux/sched.h
    +++ b/include/linux/sched.h
    @@ -2924,7 +2924,7 @@ static inline int __fatal_signal_pending(struct task_struct *p)

     static inline int fatal_signal_pending(struct task_struct *p)
     {
    -       return signal_pending(p) && __fatal_signal_pending(p);
    +       return (p->flags & PF_EXITING) || (signal_pending(p) && __fatal_signal_pending(p));
     }

     static inline int signal_pending_state(long state, struct task_struct *p)

either of which feels a bit hacky to me.

That general "consider the final exit always as if we have a fatal
signal pending" feels like a more generic fix, but it makes me think
that the final access will fail on NFS-backed mmaps too. That could be
seen as a good thing (avoiding hangs when the NFS server dies), but it
also means that the patch clearly changes *other* semantics too, not
just the userfaultfd case.

So (a) is more targeted, and might be safer.
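
To make the difference in scope concrete, here is a small userspace
model of the two options. It is only a sketch, not kernel code: the
struct, the MODEL_PF_EXITING value, and the helper names are invented
for illustration, and the wait conditions are simplified to a single
"would this task block?" predicate. It does not model what the fault
returns after the early exit in (a).

```c
#include <stdbool.h>

/* Illustrative value only; the real flag is defined in the kernel headers. */
#define MODEL_PF_EXITING 0x4u

/* Hypothetical stand-in for the few task_struct bits that matter here. */
struct task_model {
	unsigned int flags;
	bool signal_pending;   /* stands in for signal_pending(p) */
	bool sigkill_pending;  /* stands in for __fatal_signal_pending(p) */
};

/* Unpatched predicate: only a pending fatal signal counts. */
static bool fatal_signal_pending_cur(const struct task_model *p)
{
	return p->signal_pending && p->sigkill_pending;
}

/* Option (b): an exiting task always counts as fatally signalled. */
static bool fatal_signal_pending_b(const struct task_model *p)
{
	return (p->flags & MODEL_PF_EXITING) || fatal_signal_pending_cur(p);
}

/* Option (a): only the userfault path bails out early for an exiting task. */
static bool userfault_would_wait_a(const struct task_model *p)
{
	if (p->flags & MODEL_PF_EXITING)
		return false;
	return !fatal_signal_pending_cur(p);
}

/*
 * Any other killable wait (assumed here to be, say, a fault on an
 * NFS-backed mapping). With (a) it keeps using the unpatched predicate;
 * with (b) it silently picks up the new behaviour.
 */
static bool other_killable_wait_blocks(const struct task_model *p, bool use_b)
{
	return use_b ? !fatal_signal_pending_b(p) : !fatal_signal_pending_cur(p);
}
```

For an exiting task with no signal pending, both options stop the
userfault wait in this model, but only (b) also changes the result of
other_killable_wait_blocks(). That is the "other semantics" concern.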

Does anybody have any other suggestions?

(The above patches are entirely untested; maybe I misread the reason
it might be hanging and something else is going on.)

                        Linus