Message-ID: <20250820105518.Yf36NzJd@linutronix.de>
Date: Wed, 20 Aug 2025 12:55:18 +0200
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: Tejun Heo <tj@...nel.org>
Cc: linux-kernel@...r.kernel.org, linux-rt-devel@...ts.linux.dev,
	Lai Jiangshan <jiangshanlai@...il.com>,
	Ingo Molnar <mingo@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH] softirq: Provide a handshake for canceling tasklets via
 polling on PREEMPT_RT

On 2025-08-20 12:36:59 [+0200], To Tejun Heo wrote:
> Subject: [PATCH] workqueue: Provide a handshake for canceling BH workers
…
> This will flush all BH-work items assigned to that pool.

We need to flush all items because the inserted wq_barrier is at the
end of the queue. So if the cb_lock is dropped after
worker->current_func(work), then we will livelock. Just tested it; I
somehow assumed it polls on the worker.

This behaviour is undesired and goes back to the requirement of being
able to cancel something from an atomic context, which can't be atomic
on PREEMPT_RT to begin with.

Since the caller can never be atomic on PREEMPT_RT, what about the
following:

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index c6b79b3675c31..6ce9c980a7966 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -4214,28 +4214,16 @@ static bool __flush_work(struct work_struct *work, bool from_cancel)
 	 * can't currently be queued. Its data must contain OFFQ bits. If @work
 	 * was queued on a BH workqueue, we also know that it was running in the
 	 * BH context and thus can be busy-waited.
+	 * On PREEMPT_RT the BH context can not be busy-waited because it can be
+	 * preempted by the caller if it has higher priority.
 	 */
-	if (from_cancel) {
+	if (from_cancel && !IS_ENABLED(CONFIG_PREEMPT_RT)) {
 		unsigned long data = *work_data_bits(work);
 
 		if (!WARN_ON_ONCE(data & WORK_STRUCT_PWQ) &&
 		    (data & WORK_OFFQ_BH)) {
-			/*
-			 * On RT, prevent a live lock when %current preempted
-			 * soft interrupt processing or prevents ksoftirqd from
-			 * running by keeping flipping BH. If the BH work item
-			 * runs on a different CPU then this has no effect other
-			 * than doing the BH disable/enable dance for nothing.
-			 * This is copied from
-			 * kernel/softirq.c::tasklet_unlock_spin_wait().
-			 */
 			while (!try_wait_for_completion(&barr.done)) {
-				if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
-					local_bh_disable();
-					local_bh_enable();
-				} else {
-					cpu_relax();
-				}
+				cpu_relax();
 			}
 			goto out_destroy;
 		}
@@ -4351,7 +4339,7 @@ static bool __cancel_work_sync(struct work_struct *work, u32 cflags)
 
 	ret = __cancel_work(work, cflags | WORK_CANCEL_DISABLE);
 
-	if (*work_data_bits(work) & WORK_OFFQ_BH)
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT) && *work_data_bits(work) & WORK_OFFQ_BH)
 		WARN_ON_ONCE(in_hardirq());
 	else
 		might_sleep();


It is not the revert I suggested.
This should work for a softirq caller and from a force-threaded
interrupt (not that I encourage such behaviour).
It will not work from an atomic context, such as with a raw_spinlock_t
acquired, but that does not work with the current
(local_bh_disable() + enable()) solution either.
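
For illustration, a minimal sketch of the calling context this assumes
(hypothetical driver, not part of the patch; only the my_* names are made
up, the workqueue calls are the regular API): the canceling side must be
allowed to sleep, because with the change above cancel_work_sync() on a
BH work item may block on PREEMPT_RT instead of busy-waiting:

#include <linux/interrupt.h>
#include <linux/workqueue.h>

static void my_bh_work_fn(struct work_struct *work)
{
	/* Runs in BH context via the system BH workqueue. */
}

static DECLARE_WORK(my_bh_work, my_bh_work_fn);

static irqreturn_t my_threaded_irq(int irq, void *dev_id)
{
	/* Force-threaded on PREEMPT_RT, so queueing from here is fine. */
	queue_work(system_bh_wq, &my_bh_work);
	return IRQ_HANDLED;
}

static void my_teardown(void)
{
	/*
	 * Sleepable context, no raw_spinlock_t held: on PREEMPT_RT this may
	 * now sleep in wait_for_completion() instead of busy-waiting on the
	 * barrier.
	 */
	cancel_work_sync(&my_bh_work);
}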

I prefer this because it avoids the locking in bh_worker() and the
flushing of all pending work items in the flush/cancel case.

Sebastian
