Message-Id: <20211207073543.61092-8-jiangshanlai@gmail.com>
Date: Tue, 7 Dec 2021 15:35:43 +0800
From: Lai Jiangshan <jiangshanlai@...il.com>
To: linux-kernel@...r.kernel.org, Tejun Heo <tj@...nel.org>
Cc: Lai Jiangshan <laijs@...ux.alibaba.com>,
Lai Jiangshan <jiangshanlai@...il.com>
Subject: [RFC PATCH 7/7] workqueue: Replace pool lock with preemption disabling in wq_worker_sleeping()
From: Lai Jiangshan <laijs@...ux.alibaba.com>
Once upon a time, wq_worker_sleeping() was called with the rq lock held,
so it could not take the pool lock. Instead, it used "X:" protection:
preemption disabled on the local CPU. wq_worker_sleeping() did not depend
on the rq lock to work, even though it was held.
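In other words, "X:" protection boils down to the following pattern (a
sketch for illustration, assuming a pool bound to the local CPU; not
code from this patch):

	struct worker *next;

	preempt_disable();
	/*
	 * Illustrative only: with preemption disabled on the local
	 * CPU, nothing that could unlink the first idle worker runs
	 * under us, so idle_list.next can be dereferenced without
	 * pool->lock.
	 */
	next = first_idle_worker(pool);
	if (next)
		wake_up_process(next->task);
	preempt_enable();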
Now wq_worker_sleeping() is no longer called with the rq lock held, and
it takes the pool lock instead. But the "X:" protection has never been
removed, and wq_worker_running() still relies on it.

So wq_worker_sleeping() can use "X:" protection as well and avoid taking
the pool lock.
This patch also documents that only idle_list.next is under "X:"
protection, not the whole idle_list, because destroy_worker() called
from the idle timer can remove non-first idle workers. The idle timer
can stray to a different CPU, and even on the same CPU it can interrupt
a preemption-disabled context.
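A hypothetical interleaving illustrating why only idle_list.next is
under "X:" protection:

	CPU0 (preemption disabled)       idle timer (other CPU, or IRQ on CPU0)
	--------------------------       --------------------------------------
	next = first_idle_worker(pool);
	                                 raw_spin_lock(&pool->lock);
	                                 destroy_worker(a non-first idle worker);
	                                 raw_spin_unlock(&pool->lock);
	wake_up_process(next->task);

The wakeup remains safe because too_many_workers() guarantees that
destroy_worker() never removes the first idle worker, so the pointer
read from idle_list.next stays valid.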
Signed-off-by: Lai Jiangshan <laijs@...ux.alibaba.com>
---
kernel/workqueue.c | 23 ++++++++++++++---------
1 file changed, 14 insertions(+), 9 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 33f1106b4f99..6c30edbe2fc2 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -162,7 +162,8 @@ struct worker_pool {
int nr_workers; /* L: total number of workers */
int nr_idle; /* L: currently idle workers */
- struct list_head idle_list; /* X: list of idle workers */
+ /* idle_list.next and first_idle_worker(): X: first idle worker */
+ struct list_head idle_list; /* L: list of idle workers */
struct timer_list idle_timer; /* L: worker idle timeout */
struct timer_list mayday_timer; /* L: SOS timer for workers */
@@ -819,6 +820,11 @@ static bool too_many_workers(struct worker_pool *pool)
int nr_idle = pool->nr_idle + managing; /* manager is considered idle */
int nr_busy = pool->nr_workers - nr_idle;
+ /*
+ * nr_idle must be at least 2 to allow for a manager and at least one
+ * idle worker in idle_list, so that idle_worker_timeout() doesn't
+ * touch idle_list.next.
+ */
return nr_idle > 2 && (nr_idle - 2) * MAX_IDLE_WORKERS_RATIO >= nr_busy;
}
@@ -826,7 +832,7 @@ static bool too_many_workers(struct worker_pool *pool)
* Wake up functions.
*/
-/* Return the first idle worker. Safe with preemption disabled */
+/* Return the first idle worker. Safe with "X:" protection */
static struct worker *first_idle_worker(struct worker_pool *pool)
{
if (unlikely(list_empty(&pool->idle_list)))
@@ -905,7 +911,7 @@ void wq_worker_sleeping(struct task_struct *task)
return;
worker->sleeping = 1;
- raw_spin_lock_irq(&pool->lock);
+ preempt_disable();
/*
* Recheck in case unbind_workers() preempted us. We don't
@@ -913,7 +919,7 @@ void wq_worker_sleeping(struct task_struct *task)
* and nr_running has been reset.
*/
if (worker->flags & WORKER_NOT_RUNNING) {
- raw_spin_unlock_irq(&pool->lock);
+ preempt_enable();
return;
}
@@ -923,10 +929,9 @@ void wq_worker_sleeping(struct task_struct *task)
* Please read comment there.
*
* NOT_RUNNING is clear. This means that we're bound to and
- * running on the local cpu w/ rq lock held and preemption
- * disabled, which in turn means that none else could be
- * manipulating idle_list, so dereferencing idle_list without pool
- * lock is safe.
+ * running on the local cpu with preemption disabled, which in turn
+ * means that no one else could be manipulating idle_list.next,
+ * so dereferencing idle_list.next without the pool lock is safe.
*/
if (atomic_dec_and_test(&pool->nr_running) &&
!list_empty(&pool->worklist)) {
@@ -934,7 +939,7 @@ void wq_worker_sleeping(struct task_struct *task)
if (next)
wake_up_process(next->task);
}
- raw_spin_unlock_irq(&pool->lock);
+ preempt_enable();
}
/**
--
2.19.1.6.gb485710b