[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <0320c5f9-cbda-1652-1f97-24d1a22fb298@gmail.com>
Date: Wed, 13 Jul 2022 17:52:58 +0800
From: Lai Jiangshan <jiangshanlai@...il.com>
To: Schspa Shi <schspa@...il.com>, tj@...nel.org
Cc: linux-kernel@...r.kernel.org, zhaohui.shi@...izon.ai,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH] workqueue: Use active mask for new worker when pool is
DISASSOCIATED
CC Peter.
Peter has changed the CPU binding code in workqueue.c.
I'm not understanding the problem enough, if kthread_bind_mask() is buggy
in workqueue.c, it would be buggy in other places too.
On 2022/7/7 17:05, Schspa Shi wrote:
>
> - if (worker->rescue_wq)
> - set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
> + if (worker->rescue_wq) {
> + if (pool->flags & POOL_DISASSOCIATED)
> + set_cpus_allowed_ptr(worker->task, cpu_active_mask);
> + else
> + set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
> + }
>
For unbound pools (which also has POOL_DISASSOCIATED), pool->attrs->cpumask
should be used if pool->attrs->cpumask has active cpu.
> +
> + mutex_lock(&wq_pool_attach_mutex);
> + if ((pool->flags & POOL_DISASSOCIATED)) {
> + /* We can't call get_online_cpus, there will be deadlock
> + * cpu_active_mask will no change, because we have
> + * wq_pool_attach_mutex hold.
> + **/
> + kthread_bind_mask(worker->task, cpu_active_mask);
> + } else {
> + kthread_bind_mask(worker->task, pool->attrs->cpumask);
> + }
> + mutex_unlock(&wq_pool_attach_mutex);
For unbound pools, pool->attrs->cpumask should be used if pool->attrs->cpumask
has active cpu.
wq_pool_attach_mutex is held here and in worker_attach_to_pool() which smells bad.
The change is complex. And if kthread_bind_mask() can't work as expected here,
the change I prefer would be:
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 4056f2a3f9d5..1ad8aef5fe98 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1862,6 +1862,12 @@ static void worker_attach_to_pool(struct worker *worker,
{
mutex_lock(&wq_pool_attach_mutex);
+ /*
+ * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any
+ * online CPUs. It'll be re-applied when any of the CPUs come up.
+ */
+ set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
+
/*
* The wq_pool_attach_mutex ensures %POOL_DISASSOCIATED remains
* stable across this function. See the comments above the flag
@@ -1872,9 +1877,6 @@ static void worker_attach_to_pool(struct worker *worker,
else
kthread_set_per_cpu(worker->task, pool->cpu);
- if (worker->rescue_wq)
- set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
-
list_add_tail(&worker->node, &pool->workers);
worker->pool = pool;
@@ -1952,7 +1954,7 @@ static struct worker *create_worker(struct worker_pool *pool)
goto fail;
set_user_nice(worker->task, pool->attrs->nice);
- kthread_bind_mask(worker->task, pool->attrs->cpumask);
+ worker->flags |= PF_NO_SETAFFINITY;
/* successful, attach the worker to the pool */
worker_attach_to_pool(worker, pool);
@@ -4270,7 +4272,7 @@ static int init_rescuer(struct workqueue_struct *wq)
}
wq->rescuer = rescuer;
- kthread_bind_mask(rescuer->task, cpu_possible_mask);
+ rescuer->flags |= PF_NO_SETAFFINITY;
wake_up_process(rescuer->task);
return 0;
It is untested. It effectively reverts the commit 640f17c82460e
("workqueue: Restrict affinity change to rescuer").
It avoids using kthread_bind_mask().
Powered by blists - more mailing lists