linux-kernel - Re: [RFC][PATCH] signal: Store pending signal exit in tsk.jobctl not in tsk.pending

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CACT4Y+bx6Jz0AF5ovoN=WQcbqQkF4Us=2-ACy8rzeML_GPa5Xw@mail.gmail.com>
Date:   Wed, 6 Feb 2019 13:09:52 +0100
From:   Dmitry Vyukov <dvyukov@...gle.com>
To:     "Eric W. Biederman" <ebiederm@...ssion.com>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        jolsa@...hat.com, Namhyung Kim <namhyung@...nel.org>,
        luca abeni <luca.abeni@...tannapisa.it>,
        syzkaller <syzkaller@...glegroups.com>,
        Oleg Nesterov <oleg@...hat.com>
Subject: Re: [RFC][PATCH] signal: Store pending signal exit in tsk.jobctl not
 in tsk.pending

On Tue, Feb 5, 2019 at 4:26 PM Eric W. Biederman <ebiederm@...ssion.com> wrote:
>
>
> Recently syzkaller was able to create unkillablle processes by
> creating a timer that is delivered as a thread local signal on SIGHUP,
> and receiving SIGHUP SA_NODEFERER.  Ultimately causing a loop
> failing to deliver SIGHUP but always trying.
>
> Upon examination it turns out part of the problem is actually most of
> the solution.  Since 2.5 complete_signal has found all fatal signals
> and queued SIGKILL in every threads thread queue relying on
> signal->group_exit_code to preserve the information of which was the
> actual fatal signal.
>
> The conversion of all fatal signals to SIGKILL results in the
> synchronous signal heuristic in next_signal kicking in and preferring
> SIGHUP to SIGKILL.  Which is especially problematic as all
> fatal signals have already been transformed into SIGKILL.
>
> Now that we have task->jobctl we can do better and set a bit in
> task->jobctl instead of reusing tsk->pending[SIGKILL].  This allows
> giving already detected process exits a higher priority than any
> pending signal.
>
> Cc: stable@...r.kernel.org
> Reported-by: Dmitry Vyukov <dvyukov@...gle.com>
> Ref: ebf5ebe31d2c ("[PATCH] signal-fixes-2.5.59-A4")
> History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
> Signed-off-by: "Eric W. Biederman" <ebiederm@...ssion.com>
> ---
> Oleg, can you give this a quick review and see if I am missing anything?
> Dmitry, can you verify this runs cleanly in your test setups?

Yes, this fixes the hang in my setup.
The process still hangs until I hit ^C or kill, but I guess this is an
intended behavior for such program.
Thanks for the quick fix.

Tested-by: Dmitry Vyukov <dvyukov@...gle.com>



>  fs/coredump.c                |  2 +-
>  include/linux/sched/jobctl.h |  1 +
>  include/linux/sched/signal.h |  2 +-
>  kernel/signal.c              | 10 ++++++++--
>  4 files changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/fs/coredump.c b/fs/coredump.c
> index e42e17e55bfd..487995293ef0 100644
> --- a/fs/coredump.c
> +++ b/fs/coredump.c
> @@ -322,7 +322,7 @@ static int zap_process(struct task_struct *start, int exit_code, int flags)
>         for_each_thread(start, t) {
>                 task_clear_jobctl_pending(t, JOBCTL_PENDING_MASK);
>                 if (t != current && t->mm) {
> -                       sigaddset(&t->pending.signal, SIGKILL);
> +                       t->jobctl |= JOBCTL_TASK_EXIT;
>                         signal_wake_up(t, 1);
>                         nr++;
>                 }
> diff --git a/include/linux/sched/jobctl.h b/include/linux/sched/jobctl.h
> index 98228bd48aee..ff7b3ea43f4c 100644
> --- a/include/linux/sched/jobctl.h
> +++ b/include/linux/sched/jobctl.h
> @@ -18,6 +18,7 @@ struct task_struct;
>  #define JOBCTL_TRAP_NOTIFY_BIT 20      /* trap for NOTIFY */
>  #define JOBCTL_TRAPPING_BIT    21      /* switching to TRACED */
>  #define JOBCTL_LISTENING_BIT   22      /* ptracer is listening for events */
> +#define JOBCTL_TASK_EXIT       23      /* task is exiting */
>
>  #define JOBCTL_STOP_DEQUEUED   (1UL << JOBCTL_STOP_DEQUEUED_BIT)
>  #define JOBCTL_STOP_PENDING    (1UL << JOBCTL_STOP_PENDING_BIT)
> diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
> index 13789d10a50e..3f3edadf1ae1 100644
> --- a/include/linux/sched/signal.h
> +++ b/include/linux/sched/signal.h
> @@ -354,7 +354,7 @@ static inline int signal_pending(struct task_struct *p)
>
>  static inline int __fatal_signal_pending(struct task_struct *p)
>  {
> -       return unlikely(sigismember(&p->pending.signal, SIGKILL));
> +       return unlikely(p->jobctl & JOBCTL_TASK_EXIT);
>  }
>
>  static inline int fatal_signal_pending(struct task_struct *p)
> diff --git a/kernel/signal.c b/kernel/signal.c
> index 9ca8e5278c8e..0577e37fdf43 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -989,7 +989,7 @@ static void complete_signal(int sig, struct task_struct *p, enum pid_type type)
>                         t = p;
>                         do {
>                                 task_clear_jobctl_pending(t, JOBCTL_PENDING_MASK);
> -                               sigaddset(&t->pending.signal, SIGKILL);
> +                               t->jobctl |= JOBCTL_TASK_EXIT;
>                                 signal_wake_up(t, 1);
>                         } while_each_thread(p, t);
>                         return;
> @@ -1273,7 +1273,7 @@ int zap_other_threads(struct task_struct *p)
>                 /* Don't bother with already dead threads */
>                 if (t->exit_state)
>                         continue;
> -               sigaddset(&t->pending.signal, SIGKILL);
> +               t->jobctl |= JOBCTL_TASK_EXIT;
>                 signal_wake_up(t, 1);
>         }
>
> @@ -2393,6 +2393,11 @@ bool get_signal(struct ksignal *ksig)
>                 goto relock;
>         }
>
> +       /* Has this task already been flagged for death? */
> +       ksig->info.si_signo = signr = SIGKILL;
> +       if (current->jobctl & JOBCTL_TASK_EXIT)
> +               goto fatal;
> +
>         for (;;) {
>                 struct k_sigaction *ka;
>
> @@ -2488,6 +2493,7 @@ bool get_signal(struct ksignal *ksig)
>                         continue;
>                 }
>
> +       fatal:
>                 spin_unlock_irq(&sighand->siglock);
>
>                 /*
> --
> 2.17.1
>