Message-ID: <87a5jpqamx.fsf@email.froward.int.ebiederm.org>
Date: Thu, 13 Jun 2024 07:40:06 -0500
From: "Eric W. Biederman" <ebiederm@...ssion.com>
To: Oleg Nesterov <oleg@...hat.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,  Rachel Menge
 <rachelmenge@...ux.microsoft.com>,  linux-kernel@...r.kernel.org,
  rcu@...r.kernel.org,  Wei Fu <fuweid89@...il.com>,
  apais@...ux.microsoft.com,  Sudhanva Huruli
 <Sudhanva.Huruli@...rosoft.com>,  Jens Axboe <axboe@...nel.dk>,  Christian
 Brauner <brauner@...nel.org>,  Mike Christie
 <michael.christie@...cle.com>,  Joel Granados <j.granados@...sung.com>,
  Mateusz Guzik <mjguzik@...il.com>,  "Paul E. McKenney"
 <paulmck@...nel.org>,  Frederic Weisbecker <frederic@...nel.org>,  Neeraj
 Upadhyay <neeraj.upadhyay@...nel.org>,  Joel Fernandes
 <joel@...lfernandes.org>,  Josh Triplett <josh@...htriplett.org>,  Boqun
 Feng <boqun.feng@...il.com>,  Steven Rostedt <rostedt@...dmis.org>,
  Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,  Lai Jiangshan
 <jiangshanlai@...il.com>,  Zqiang <qiang.zhang1211@...il.com>
Subject: Re: [PATCH] zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along
 with TIF_SIGPENDING

Oleg Nesterov <oleg@...hat.com> writes:

> kernel_wait4() doesn't sleep and returns -EINTR if there is no
> eligible child and signal_pending() is true.
>
> That is why zap_pid_ns_processes() clears TIF_SIGPENDING but this is not
> enough, it should also clear TIF_NOTIFY_SIGNAL to make signal_pending()
> return false and avoid a busy-wait loop.

I took a look through the code.  It used to be that TIF_NOTIFY_SIGNAL
was all about waking up a task so that task_work_run can be used.
io_uring still mostly uses it that way.  There is also a use in
kthread_stop that just uses it as a TIF_SIGPENDING without having a
pending signal.

At the point in do_exit where exit_notify and thus zap_pid_ns_processes
is called I can't possibly see a use for TIF_NOTIFY_SIGNAL.
exit_task_work, exit_signals, and io_uring_cancel have all been called.

So TIF_NOTIFY_SIGNAL should be spurious at this point and safe to clear.
Why it remains set is a mystery to me.
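
To make the busy-wait concrete, here is a rough userspace model of the
loop.  It is only a sketch, not kernel code: the model_* names, the
flag bit values, the iteration cap, and the assumption that every child
is still running (so nothing is eligible to be reaped until the waiter
sleeps) are all made up for illustration.  The one thing taken from the
kernel is the rule that signal_pending() is true when either
TIF_SIGPENDING or TIF_NOTIFY_SIGNAL is set.

```c
#include <errno.h>

/* Hypothetical bit values; the real TIF_* numbers are per-architecture. */
#define TIF_SIGPENDING    (1u << 0)
#define TIF_NOTIFY_SIGNAL (1u << 1)

/* Toy stand-in for the exiting pid namespace init task. */
struct model_task {
	unsigned int flags;
	int live_children;	/* children that are still running */
};

/* Mirrors the rule that either flag makes signal_pending() true. */
static int model_signal_pending(const struct model_task *t)
{
	return (t->flags & (TIF_SIGPENDING | TIF_NOTIFY_SIGNAL)) != 0;
}

/*
 * Toy kernel_wait4(): with no children left it returns -ECHILD.  With
 * children running and a signal pending it returns -EINTR without
 * sleeping.  Otherwise it "sleeps" until one child exits and reaps it.
 */
static int model_wait4(struct model_task *t)
{
	if (t->live_children == 0)
		return -ECHILD;
	if (model_signal_pending(t))
		return -EINTR;
	t->live_children--;
	return 1;
}

/*
 * Toy zap_pid_ns_processes() loop.  Returns the number of iterations;
 * cap only exists so that the spinning case terminates in the model.
 */
static long model_zap(struct model_task *t, int clear_notify, long cap)
{
	long iters = 0;
	int rc;

	do {
		t->flags &= ~TIF_SIGPENDING;
		if (clear_notify)
			t->flags &= ~TIF_NOTIFY_SIGNAL;
		rc = model_wait4(t);
		iters++;
	} while (rc != -ECHILD && iters < cap);

	return iters;
}
```

With both flags set and three live children, model_zap(&t, 1, 1000)
finishes in 4 iterations (three reaps plus the final -ECHILD), while
model_zap(&t, 0, 1000) never sleeps and runs into the cap of 1000.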


If I had infinite time and energy, the ideal would be to rework the pid
namespace exit logic so that waiting for everything to exit works like
delay_group_leader in wait_consider_task: simply block reaping of the
pid namespace leader until everything in the pid namespace has been
reaped.  I think acct_exit_ns is the only piece of code that needs
to be moved to allow that, and acct_exit_ns is purely bookkeeping, so
it does not affect userspace-visible semantics.

This active waiting is weird and non-standard in the kernel, and because
of that it winds up causing a problem every couple of years.

>
> Fixes: 12db8b690010 ("entry: Add support for TIF_NOTIFY_SIGNAL")
> Reported-by: Rachel Menge <rachelmenge@...ux.microsoft.com>
> Closes: https://lore.kernel.org/all/1386cd49-36d0-4a5c-85e9-bc42056a5a38@linux.microsoft.com/
> Signed-off-by: Oleg Nesterov <oleg@...hat.com>
> ---
>  kernel/pid_namespace.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
> index dc48fecfa1dc..25f3cf679b35 100644
> --- a/kernel/pid_namespace.c
> +++ b/kernel/pid_namespace.c
> @@ -218,6 +218,7 @@ void zap_pid_ns_processes(struct pid_namespace *pid_ns)
>  	 */
>  	do {
>  		clear_thread_flag(TIF_SIGPENDING);
> +		clear_thread_flag(TIF_NOTIFY_SIGNAL);
>  		rc = kernel_wait4(-1, NULL, __WALL, NULL);
>  	} while (rc != -ECHILD);
