Message-ID: <20250820105518.Yf36NzJd@linutronix.de>
Date: Wed, 20 Aug 2025 12:55:18 +0200
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: Tejun Heo <tj@...nel.org>
Cc: linux-kernel@...r.kernel.org, linux-rt-devel@...ts.linux.dev,
	Lai Jiangshan <jiangshanlai@...il.com>,
	Ingo Molnar <mingo@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH] softirq: Provide a handshake for canceling tasklets via
 polling on PREEMPT_RT

On 2025-08-20 12:36:59 [+0200], To Tejun Heo wrote:
> Subject: [PATCH] workqueue: Provide a handshake for canceling BH workers
…
> This will flush all BH-work items assigned to that pool.

We need to flush all items because the inserted wq_barrier is at the
end of the queue. So if the cb_lock is dropped after
worker->current_func(work), then we will livelock. Just tested it; I
somehow assumed it polls on the worker.

This behaviour is undesired and goes back to the requirement of being
able to cancel something from an atomic context, which can't be atomic
on PREEMPT_RT to begin with.

Since the caller can never be atomic on PREEMPT_RT, what about the
following:

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index c6b79b3675c31..6ce9c980a7966 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -4214,28 +4214,16 @@ static bool __flush_work(struct work_struct *work, bool from_cancel)
 	 * can't currently be queued. Its data must contain OFFQ bits. If @work
 	 * was queued on a BH workqueue, we also know that it was running in the
 	 * BH context and thus can be busy-waited.
+	 * On PREEMPT_RT the BH context can not be busy-waited because it can be
+	 * preempted by the caller if it has higher priority.
 	 */
-	if (from_cancel) {
+	if (from_cancel && !IS_ENABLED(CONFIG_PREEMPT_RT)) {
 		unsigned long data = *work_data_bits(work);
 
 		if (!WARN_ON_ONCE(data & WORK_STRUCT_PWQ) &&
 		    (data & WORK_OFFQ_BH)) {
-			/*
-			 * On RT, prevent a live lock when %current preempted
-			 * soft interrupt processing or prevents ksoftirqd from
-			 * running by keeping flipping BH. If the BH work item
-			 * runs on a different CPU then this has no effect other
-			 * than doing the BH disable/enable dance for nothing.
-			 * This is copied from
-			 * kernel/softirq.c::tasklet_unlock_spin_wait().
-			 */
 			while (!try_wait_for_completion(&barr.done)) {
-				if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
-					local_bh_disable();
-					local_bh_enable();
-				} else {
-					cpu_relax();
-				}
+				cpu_relax();
 			}
 			goto out_destroy;
 		}
@@ -4351,7 +4339,7 @@ static bool __cancel_work_sync(struct work_struct *work, u32 cflags)
 
 	ret = __cancel_work(work, cflags | WORK_CANCEL_DISABLE);
 
-	if (*work_data_bits(work) & WORK_OFFQ_BH)
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT) && *work_data_bits(work) & WORK_OFFQ_BH)
 		WARN_ON_ONCE(in_hardirq());
 	else
 		might_sleep();


It is not the revert I suggested.
This should work for a softirq caller and from a force-threaded
interrupt (not that I encourage such behaviour).
It will not work from an atomic context, such as with a raw_spinlock_t
acquired, but that does not work with the current
(local_bh_disable() + enable()) solution either.
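
For illustration, a minimal sketch of the calling context this assumes
(hypothetical driver, not part of the patch; only the my_* names are made
up, the workqueue calls are the regular API): the canceling side must be
allowed to sleep, because with the change above cancel_work_sync() on a
BH work item may block on PREEMPT_RT instead of busy-waiting:

#include <linux/interrupt.h>
#include <linux/workqueue.h>

static void my_bh_work_fn(struct work_struct *work)
{
	/* Runs in BH context via the system BH workqueue. */
}

static DECLARE_WORK(my_bh_work, my_bh_work_fn);

static irqreturn_t my_threaded_irq(int irq, void *dev_id)
{
	/* Force-threaded on PREEMPT_RT, so queueing from here is fine. */
	queue_work(system_bh_wq, &my_bh_work);
	return IRQ_HANDLED;
}

static void my_teardown(void)
{
	/*
	 * Sleepable context, no raw_spinlock_t held: on PREEMPT_RT this may
	 * now sleep in wait_for_completion() instead of busy-waiting on the
	 * barrier.
	 */
	cancel_work_sync(&my_bh_work);
}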

I prefer this because it avoids the locking in bh_worker() and the
flushing of all pending work items in the flush/cancel case.

Sebastian
