Message-ID: <Z6PjzA61v6s732JF@pavilion.home>
Date: Wed, 5 Feb 2025 23:18:52 +0100
From: Frederic Weisbecker <frederic@...nel.org>
To: Oleg Nesterov <oleg@...hat.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Mateusz Guzik <mjguzik@...il.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/1] exit: change the release_task() paths to call
flush_sigqueue() lockless
On Wed, Feb 05, 2025 at 06:51:59PM +0100, Oleg Nesterov wrote:
> A task can block a signal, accumulate up to RLIMIT_SIGPENDING sigqueues,
> and exit. In this case __exit_signal()->flush_sigqueue() called with irqs
> disabled can trigger a hard lockup, see
> https://lore.kernel.org/all/20190322114917.GC28876@redhat.com/
>
> Fortunately, after the recent posixtimer changes sys_timer_delete() paths
> no longer try to clear SIGQUEUE_PREALLOC and/or free tmr->sigq, and after
> the exiting task passes __exit_signal() lock_task_sighand() can't succeed
> and pid_task(tmr->it_pid) will return NULL.
>
> This means that after __exit_signal(tsk) nobody can play with tsk->pending
> or (if group_dead) with tsk->signal->shared_pending, so release_task() can
> safely call flush_sigqueue() after write_unlock_irq(&tasklist_lock).
>
> Also, kill clear_tsk_thread_flag(TIF_SIGPENDING), it was never needed.
>
> TODO:
> - we can probably shift posix_cpu_timers_exit() as well
> - do_sigaction() can hit a similar problem
>
> Signed-off-by: Oleg Nesterov <oleg@...hat.com>
> ---
> kernel/exit.c | 15 ++++++---------
> 1 file changed, 6 insertions(+), 9 deletions(-)
>
> diff --git a/kernel/exit.c b/kernel/exit.c
> index 3485e5fc499e..bc2c24ea4181 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -200,20 +200,12 @@ static void __exit_signal(struct task_struct *tsk)
> __unhash_process(tsk, group_dead);
> write_sequnlock(&sig->stats_lock);
>
> - /*
> - * Do this under ->siglock, we can race with another thread
> - * doing sigqueue_free() if we have SIGQUEUE_PREALLOC signals.
> - */
> - flush_sigqueue(&tsk->pending);
> tsk->sighand = NULL;
> spin_unlock(&sighand->siglock);
>
> __cleanup_sighand(sighand);
> - clear_tsk_thread_flag(tsk, TIF_SIGPENDING);
Looks good to me, except for the TIF_SIGPENDING removal, which I'm less
sure about: I see a lot of places where the flag is set or cleared. Well, it's
probably only checked locally in entry code. Would it make sense to move
this chunk to a separate preceding patch? Or keep it here, but at least
explain in the changelog why it is safe to remove?
Thanks!
> - if (group_dead) {
> - flush_sigqueue(&sig->shared_pending);
> + if (group_dead)
> tty_kref_put(tty);
> - }
> }
>
> static void delayed_put_task_struct(struct rcu_head *rhp)
> @@ -279,6 +271,11 @@ void release_task(struct task_struct *p)
> proc_flush_pid(thread_pid);
> put_pid(thread_pid);
> release_thread(p);
> +
> + flush_sigqueue(&p->pending);
> + if (thread_group_leader(p))
> + flush_sigqueue(&p->signal->shared_pending);
> +
> put_task_struct_rcu_user(p);
>
> p = leader;
> --
> 2.25.1.362.g51ebf55
>
>
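For readers following along, here is a minimal user-space sketch of the
idea behind the patch: make the object unreachable inside a short critical
section, then free its pending list with no lock held. It is only an
analogy, not kernel code. All names (struct task, task_queue(),
task_unpublish(), task_flush()) are hypothetical; the mutex stands in for
->siglock and the "reachable" flag for tsk->sighand != NULL.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names are kernel APIs. */
struct item {
	struct item *next;
};

struct task {
	pthread_mutex_t lock;	/* plays the role of ->siglock */
	bool reachable;		/* plays the role of tsk->sighand != NULL */
	struct item *pending;	/* plays the role of tsk->pending */
};

void task_init(struct task *t)
{
	pthread_mutex_init(&t->lock, NULL);
	t->reachable = true;
	t->pending = NULL;
}

/* Sender side: queueing succeeds only while the task is still reachable. */
bool task_queue(struct task *t)
{
	struct item *it = malloc(sizeof(*it));
	bool queued = false;

	if (!it)
		return false;

	pthread_mutex_lock(&t->lock);
	if (t->reachable) {
		it->next = t->pending;
		t->pending = it;
		queued = true;
	}
	pthread_mutex_unlock(&t->lock);

	if (!queued)
		free(it);
	return queued;
}

/* Exit side, step 1: short critical section that only unpublishes. */
void task_unpublish(struct task *t)
{
	pthread_mutex_lock(&t->lock);
	t->reachable = false;
	pthread_mutex_unlock(&t->lock);
}

/*
 * Exit side, step 2: called by the same thread after task_unpublish(),
 * with no lock held. No sender can add to t->pending any more, so the
 * potentially long walk does not extend the critical section.
 * Returns the number of entries freed.
 */
size_t task_flush(struct task *t)
{
	size_t freed = 0;

	assert(!t->reachable);
	while (t->pending) {
		struct item *it = t->pending;

		t->pending = it->next;
		free(it);
		freed++;
	}
	return freed;
}
```

The sketch only covers the list-freeing part. It says nothing about
whether dropping clear_tsk_thread_flag(TIF_SIGPENDING) is safe, which is
the open question in the reply above.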