Message-Id: <20220719165743.3409313-1-vschneid@redhat.com>
Date: Tue, 19 Jul 2022 17:57:43 +0100
From: Valentin Schneider <vschneid@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>, Lai Jiangshan <jiangshanlai@gmail.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Frederic Weisbecker <frederic@kernel.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Phil Auld <pauld@redhat.com>,
	Marcelo Tosatti <mtosatti@redhat.com>
Subject: [RFC PATCH] workqueue: Unbind workers before sending them to exit()
It has been reported that isolated CPUs can suffer from interference due to
per-CPU kworkers waking up just to die.
A surge of workqueue activity during initial setup can cause extra
per-CPU kworkers to be spawned (sleeping work functions exacerbate
this). A latency-sensitive task can then run merrily on an isolated CPU,
only to be interrupted some time later by a kworker marked for death
(cf. IDLE_WORKER_TIMEOUT, five minutes after the last kworker activity).
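For context, that late interruption comes from the idle reaper: once a
pool has had too many idle workers for IDLE_WORKER_TIMEOUT, it picks the
longest-idle one and has destroy_worker() wake it purely so it can exit.
Roughly, simplified from kernel/workqueue.c (details vary by version):

  static void idle_worker_timeout(struct timer_list *t)
  {
          struct worker_pool *pool = from_timer(pool, t, idle_timer);

          raw_spin_lock_irq(&pool->lock);

          while (too_many_workers(pool)) {
                  struct worker *worker;
                  unsigned long expires;

                  /* idle_list is kept in LIFO order, check the last one */
                  worker = list_entry(pool->idle_list.prev, struct worker, entry);
                  expires = worker->last_active + IDLE_WORKER_TIMEOUT;

                  if (time_before(jiffies, expires)) {
                          mod_timer(&pool->idle_timer, expires);
                          break;
                  }

                  /* wakes the still pcpu-affine kworker just to do_exit() */
                  destroy_worker(worker);
          }

          raw_spin_unlock_irq(&pool->lock);
  }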
Affine kworkers to the wq_unbound_cpumask (which doesn't contain isolated
CPUs, cf. HK_TYPE_WQ) before waking them up after marking them with
WORKER_DIE.
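(For reference, wq_unbound_cpumask excluding isolated CPUs falls out of
early init; paraphrasing workqueue_init_early(), modulo tree
differences:

          cpumask_copy(wq_unbound_cpumask, housekeeping_cpumask(HK_TYPE_WQ));
          cpumask_and(wq_unbound_cpumask, wq_unbound_cpumask,
                      housekeeping_cpumask(HK_TYPE_DOMAIN));

so CPUs isolated via e.g. isolcpus= or nohz_full= never show up in it.)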
This follows the logic of CPU hot-unplug, which has been packaged into
helpers for the occasion.
Signed-off-by: Valentin Schneider <vschneid@...hat.com>
---
kernel/workqueue.c | 35 ++++++++++++++++++++++++++---------
1 file changed, 26 insertions(+), 9 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 1ea50f6be843..0f1a25ea4924 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1972,6 +1972,18 @@ static struct worker *create_worker(struct worker_pool *pool)
return NULL;
}
+static void unbind_worker(struct worker *worker)
+{
+ kthread_set_per_cpu(worker->task, -1);
+ WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, wq_unbound_cpumask) < 0);
+}
+
+static void rebind_worker(struct worker *worker, struct worker_pool *pool)
+{
+ kthread_set_per_cpu(worker->task, pool->cpu);
+ WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask) < 0);
+}
+
/**
* destroy_worker - destroy a workqueue worker
* @worker: worker to be destroyed
@@ -1999,6 +2011,16 @@ static void destroy_worker(struct worker *worker)
list_del_init(&worker->entry);
worker->flags |= WORKER_DIE;
+
+ /*
+ * We're sending that thread off to die, so any CPU would do. This is
+ * especially relevant for pcpu kworkers affined to an isolated CPU:
+ * we'd rather not interrupt an isolated CPU just for a kworker to
+ * do_exit().
+ */
+ if (!(worker->flags & WORKER_UNBOUND))
+ unbind_worker(worker);
+
wake_up_process(worker->task);
}
@@ -4999,10 +5021,8 @@ static void unbind_workers(int cpu)
raw_spin_unlock_irq(&pool->lock);
- for_each_pool_worker(worker, pool) {
- kthread_set_per_cpu(worker->task, -1);
- WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, wq_unbound_cpumask) < 0);
- }
+ for_each_pool_worker(worker, pool)
+ unbind_worker(worker);
mutex_unlock(&wq_pool_attach_mutex);
}
@@ -5027,11 +5047,8 @@ static void rebind_workers(struct worker_pool *pool)
* of all workers first and then clear UNBOUND. As we're called
* from CPU_ONLINE, the following shouldn't fail.
*/
- for_each_pool_worker(worker, pool) {
- kthread_set_per_cpu(worker->task, pool->cpu);
- WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task,
- pool->attrs->cpumask) < 0);
- }
+ for_each_pool_worker(worker, pool)
+ rebind_worker(worker, pool);
raw_spin_lock_irq(&pool->lock);
--
2.31.1