Message-ID: <20240708104221.GA18761@redhat.com>
Date: Mon, 8 Jul 2024 12:42:21 +0200
From: Oleg Nesterov <oleg@...hat.com>
To: Pavel Begunkov <asml.silence@...il.com>
Cc: io-uring@...r.kernel.org, Jens Axboe <axboe@...nel.dk>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Christian Brauner <brauner@...nel.org>,
	Tycho Andersen <tandersen@...flix.com>,
	Thomas Gleixner <tglx@...utronix.de>, linux-kernel@...r.kernel.org,
	Julian Orth <ju.orth@...il.com>
Subject: Re: [PATCH 2/2] kernel: rerun task_work while freezing in
 get_signal()

On 07/07, Pavel Begunkov wrote:
>
> io_uring can asynchronously add a task_work while the task is getting
> frozen. TIF_NOTIFY_SIGNAL will prevent the task from sleeping in
> do_freezer_trap(), and since the get_signal()'s relock loop doesn't
> retry task_work, the task will spin there not being able to sleep
> until the freezing is cancelled / the task is killed / etc.
> 
> Cc: stable@...r.kernel.org
> Link: https://github.com/systemd/systemd/issues/33626
> Fixes: 3146cba99aa28 ("io-wq: make worker creation resilient against signals")

I don't think we should blame io_uring even if so far it is the only user
of TWA_SIGNAL.

Perhaps we should change do_freezer_trap() somehow, not sure... It assumes
that TIF_SIGPENDING is the only reason not to sleep in TASK_INTERRUPTIBLE.
Today this is not true: TIF_NOTIFY_SIGNAL prevents the sleep as well.
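
To illustrate the assumption, here is a toy userspace model, not kernel
code. Every model_* name is hypothetical. It assumes that signal_pending()
reports TIF_NOTIFY_SIGNAL in addition to TIF_SIGPENDING, that
task_sigpending() looks at TIF_SIGPENDING only, and that do_freezer_trap()
clears just TIF_SIGPENDING before it tries to sleep.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model flags; they only mirror the kernel's TIF_* names. */
enum {
	MODEL_TIF_SIGPENDING    = 1 << 0,
	MODEL_TIF_NOTIFY_SIGNAL = 1 << 1,
};

struct model_task {
	unsigned int flags;
};

/* Models task_sigpending(): looks at TIF_SIGPENDING only. */
static bool model_task_sigpending(const struct model_task *t)
{
	return (t->flags & MODEL_TIF_SIGPENDING) != 0;
}

/* Models signal_pending(): TIF_NOTIFY_SIGNAL counts as well. */
static bool model_signal_pending(const struct model_task *t)
{
	return (t->flags & MODEL_TIF_NOTIFY_SIGNAL) != 0 ||
	       model_task_sigpending(t);
}

/*
 * Models schedule() in TASK_INTERRUPTIBLE: true if the task would really
 * sleep, false if it would return immediately.
 */
static bool model_interruptible_sleep(const struct model_task *t)
{
	return !model_signal_pending(t);
}

/*
 * Models the tail of do_freezer_trap() (assumption: it clears only
 * TIF_SIGPENDING and then tries to sleep).
 */
static bool model_freezer_trap_sleeps(struct model_task *t)
{
	assert(t);
	t->flags &= ~(unsigned int)MODEL_TIF_SIGPENDING;
	return model_interruptible_sleep(t);
}
```

In this model a task with only MODEL_TIF_SIGPENDING set goes to sleep,
while a task that also has MODEL_TIF_NOTIFY_SIGNAL set does not, because
nothing in the trap clears that flag.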

> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -2694,6 +2694,10 @@ bool get_signal(struct ksignal *ksig)
>  	try_to_freeze();
>  
>  relock:
> +	clear_notify_signal();
> +	if (unlikely(task_work_pending(current)))
> +		task_work_run();
> +
>  	spin_lock_irq(&sighand->siglock);

Well, but can't we kill the same code at the start of get_signal() then?
Of course, in this case get_signal() should check signal_pending(), not
task_sigpending().

Or perhaps something like the patch below makes more sense? I dunno...

Oleg.

diff --git a/kernel/signal.c b/kernel/signal.c
index 1f9dd41c04be..e2ae85293fbb 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2676,6 +2676,7 @@ bool get_signal(struct ksignal *ksig)
 	struct signal_struct *signal = current->signal;
 	int signr;
 
+start:
 	clear_notify_signal();
 	if (unlikely(task_work_pending(current)))
 		task_work_run();
@@ -2760,10 +2761,11 @@ bool get_signal(struct ksignal *ksig)
 			if (current->jobctl & JOBCTL_TRAP_MASK) {
 				do_jobctl_trap();
 				spin_unlock_irq(&sighand->siglock);
+				goto relock;
-			} else if (current->jobctl & JOBCTL_TRAP_FREEZE)
+			} else if (current->jobctl & JOBCTL_TRAP_FREEZE) {
 				do_freezer_trap();
-
-			goto relock;
+				goto start;
+			}
 		}
 
 		/*
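
The difference between retrying at "relock" and retrying at "start" can be
sketched with a toy userspace model. It is not kernel code and every
model_* name is hypothetical. It assumes that the task_work was queued with
TWA_SIGNAL after the entry code of get_signal() had already run, and that
the freezer trap cannot sleep while the notify flag is set.

```c
#include <assert.h>
#include <stdbool.h>

/* Where the loop jumps after a failed attempt to sleep in the freezer trap. */
enum model_retry { MODEL_RETRY_RELOCK, MODEL_RETRY_START };

struct model_task {
	bool notify_signal;	/* models TIF_NOTIFY_SIGNAL */
};

/* Models the entry of get_signal(): clear the flag and run the task_work. */
static void model_run_task_work(struct model_task *t)
{
	t->notify_signal = false;
}

/* Models do_freezer_trap(): the sleep succeeds only if the flag is clear. */
static bool model_freezer_trap_sleeps(const struct model_task *t)
{
	return !t->notify_signal;
}

/*
 * Returns the number of passes through the freezer trap before the task
 * sleeps, or -1 if it is still spinning after max_passes.
 */
static int model_passes_until_sleep(enum model_retry retry, int max_passes)
{
	/* Assumption: task_work arrived after the entry code already ran. */
	struct model_task t = { .notify_signal = true };
	int pass;

	assert(max_passes > 0);
	for (pass = 1; pass <= max_passes; pass++) {
		if (model_freezer_trap_sleeps(&t))
			return pass;
		if (retry == MODEL_RETRY_START)
			model_run_task_work(&t);	/* "goto start" */
		/* "goto relock" only re-takes the lock; the flag stays set. */
	}
	return -1;
}
```

With MODEL_RETRY_RELOCK the model never sleeps and returns -1 for any
max_passes. With MODEL_RETRY_START the second pass sleeps. Running the
task_work right after the "relock" label, as in the quoted patch, would
have the same effect in this model as jumping back to "start".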

