Message-Id: <20211008100454.2802393-3-boqun.feng@gmail.com>
Date: Fri, 8 Oct 2021 18:04:54 +0800
From: Boqun Feng <boqun.feng@...il.com>
To: linux-kernel@...r.kernel.org
Cc: Tejun Heo <tj@...nel.org>, Lai Jiangshan <jiangshanlai@...il.com>,
"Paul E . McKenney" <paulmck@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>,
Frederic Weisbecker <frederic@...nel.org>,
Boqun Feng <boqun.feng@...il.com>
Subject: [RFC 2/2] workqueue: Fix work re-entrance when requeue to a different workqueue
When requeuing a work item to a different workqueue while it is still
being processed, re-entrance can happen as follows:

{ both WQ1 and WQ2 are bound workqueues, and a work W has been
  queued on CPU 0 for WQ1 }

CPU 0                                   CPU 1
=====                                   =====
<In worker on CPU 0>
process_one_work():
  ...
  // pick up W
  worker->current_work = W;
  worker->current_func = W->func;
  ...
  set_work_pool_and_clear_pending(...);
  // W can be requeued afterwards
                                        queue_work_on(1, WQ2, W):
                                          if (!test_and_set_bit(...)) {
                                            // this branch is taken, as CPU 0
                                            // just cleared the pending bit.
                                            __queue_work(...):
                                              pwq = <pool for CPU 1 of WQ2>;
                                              last_pool = <pool for CPU 0 of WQ1>;
                                              if (last_pool != pwq->pool) { // true
                                                if (.. && worker->current_pwq->wq == wq) {
                                                  // false, since @worker is a
                                                  // worker of @last_pool (for
                                                  // WQ1), and @wq is WQ2.
                                                }
                                                ...
                                                insert_work(pwq, W, ...);
                                              }
                                            // W queued.
                                        <schedule to worker on CPU 1>
                                        process_one_work():
                                          collision = find_worker_executing_work(..);
                                          // NULL, because we're searching the
                                          // worker pool of CPU 1, while W is
                                          // the current work on worker pool of
                                          // CPU 0.
                                          worker->current_work = W;
                                          worker->current_func = W->func;
  worker->current_func(...);
  ...
                                          worker->current_func(...); // Re-entrance

This issue is already partially fixed: in __queue_work(), last_pool can
be used to queue the work, so that the processing of the requeued work
finds the collision and waits for the existing execution to finish.
However, last_pool is currently only used when the two workqueues are
the same, which leaves the case above unhandled. Therefore extend the
behavior to allow last_pool to be used for requeuing the work W even if
the workqueues are different. It is safe to do this since the work W has
already been proven safe to queue and run on last_pool.
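
As an illustration only, the pool selection described above can be modeled in userspace C. This is a minimal sketch, not kernel code: every type and function below (struct pool, struct pwq, model_find_worker(), model_pick_pwq(), model_requeue_cpu()) is a hypothetical, simplified stand-in, each pool is assumed to have at most one busy worker, and locking is omitted entirely. The same_wq_only flag switches between the condition before the patch and the condition after it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct worker;
struct wq     { const char *name; };
struct pool   { int cpu; struct worker *busy; /* one busy worker, for brevity */ };
struct pwq    { struct pool *pool; struct wq *wq; };
struct work   { int id; };
struct worker { struct work *current_work; struct pwq *current_pwq; };

/* Model of find_worker_executing_work(): is @work running in @pool? */
static struct worker *model_find_worker(struct pool *pool, struct work *work)
{
	struct worker *w = pool->busy;

	return (w && w->current_work == work) ? w : NULL;
}

/*
 * Model of the pwq selection in __queue_work().
 *   same_wq_only == true : condition before the patch
 *   same_wq_only == false: condition after the patch
 */
static struct pwq *model_pick_pwq(struct pwq *req, struct pool *last_pool,
				  struct work *work, bool same_wq_only)
{
	assert(req != NULL);

	if (last_pool && last_pool != req->pool) {
		struct worker *w = model_find_worker(last_pool, work);

		/* Mirrors "pwq = worker->current_pwq" from the diff. */
		if (w && (!same_wq_only || w->current_pwq->wq == req->wq))
			return w->current_pwq;
	}
	return req; /* not running there, queue here */
}

/*
 * Scenario from the commit message: W is running on CPU 0 for WQ1 and is
 * requeued to WQ2 on CPU 1. Returns the CPU of the pool that gets chosen.
 */
static int model_requeue_cpu(bool same_wq_only)
{
	struct wq wq1 = { "WQ1" }, wq2 = { "WQ2" };
	struct pool pool0 = { 0, NULL }, pool1 = { 1, NULL };
	struct pwq pwq0_wq1 = { &pool0, &wq1 }, pwq1_wq2 = { &pool1, &wq2 };
	struct work w = { 1 };
	struct worker worker0 = { &w, &pwq0_wq1 };

	pool0.busy = &worker0;
	return model_pick_pwq(&pwq1_wq2, &pool0, &w, same_wq_only)->pool->cpu;
}
```

In this model, model_requeue_cpu(true) selects the CPU 1 pool, where the collision check cannot see the worker still running W on CPU 0. model_requeue_cpu(false) selects the CPU 0 pool, where the existing collision handling applies.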
Signed-off-by: Boqun Feng <boqun.feng@...il.com>
---
kernel/workqueue.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 1418710bffcd..410141cc5f88 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1465,7 +1465,7 @@ static void __queue_work(int cpu, struct workqueue_struct *wq,
 
 		worker = find_worker_executing_work(last_pool, work);
 
-		if (worker && worker->current_pwq->wq == wq) {
+		if (worker) {
 			pwq = worker->current_pwq;
 		} else {
 			/* meh... not running there, queue here */
--
2.32.0