linux-kernel - [PATCH] workqueue: fix possible bug which may silence the pool

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-Id: <1362239729-6753-1-git-send-email-laijs@cn.fujitsu.com>
Date:	Sat, 2 Mar 2013 23:55:29 +0800
From:	Lai Jiangshan <laijs@...fujitsu.com>
To:	Tejun Heo <tj@...nel.org>, linux-kernel@...r.kernel.org
Cc:	Lai Jiangshan <laijs@...fujitsu.com>
Subject: [PATCH] workqueue: fix possible bug which may silence the pool

After we introduce multiple pools for cpu pools, a part of the comments
in wq_unbind_fn() becomes wrong.

It said that "current worker would trigger unbound chain execution".
It is wrong. current worker only belongs to one of the multiple pools.

If wq_unbind_fn() does unbind the normal_pri pool(not the pool of the current
worker), the current worker is not the available worker to trigger unbound
chain execution of the normal_pri pool, and if all the workers of
the normal_pri goto sleep after they were set %WORKER_UNBOUND but before
they finish their current work, unbound chain execution is not triggered
totally. The pool is stopped!

We can change wq_unbind_fn() only does unbind one pool and we launch multiple
wq_unbind_fn()s, one for each pool to solve the problem.
But this change will add much latency to hotplug path unnecessarily.

So we choice to wake up a worker directly to trigger unbound chain execution.

current worker may sleep on &second_pool->assoc_mutex, so we also move
the wakeup code into the loop to avoid second_pool silences the first_pool.

Signed-off-by: Lai Jiangshan <laijs@...fujitsu.com>
---
 kernel/workqueue.c |   45 ++++++++++++++++++++++++++-------------------
 1 files changed, 26 insertions(+), 19 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 81f2457..03159c2 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -3446,28 +3446,35 @@ static void wq_unbind_fn(struct work_struct *work)
 
 		spin_unlock_irq(&pool->lock);
 		mutex_unlock(&pool->assoc_mutex);
-	}
 
-	/*
-	 * Call schedule() so that we cross rq->lock and thus can guarantee
-	 * sched callbacks see the %WORKER_UNBOUND flag.  This is necessary
-	 * as scheduler callbacks may be invoked from other cpus.
-	 */
-	schedule();
+		/*
+		 * Call schedule() so that we cross rq->lock and thus can
+		 * guarantee sched callbacks see the %WORKER_UNBOUND flag.
+		 * This is necessary as scheduler callbacks may be invoked
+		 * from other cpus.
+		 */
+		schedule();
 
-	/*
-	 * Sched callbacks are disabled now.  Zap nr_running.  After this,
-	 * nr_running stays zero and need_more_worker() and keep_working()
-	 * are always true as long as the worklist is not empty.  Pools on
-	 * @cpu now behave as unbound (in terms of concurrency management)
-	 * pools which are served by workers tied to the CPU.
-	 *
-	 * On return from this function, the current worker would trigger
-	 * unbound chain execution of pending work items if other workers
-	 * didn't already.
-	 */
-	for_each_std_worker_pool(pool, cpu)
+		/*
+		 * Sched callbacks are disabled now.  Zap nr_running.
+		 * After this, nr_running stays zero and need_more_worker()
+		 * and keep_working() are always true as long as the worklist
+		 * is not empty.  This pool now behave as unbound (in terms of
+		 * concurrency management) pool which are served by workers
+		 * tied to the pool.
+		 */
 		atomic_set(&pool->nr_running, 0);
+
+		/* The current busy workers of this pool may goto sleep without
+		 * wake up any other worker after they were set %WORKER_UNBOUND
+		 * flag. Here we wake up another possible worker to start
+		 * the unbound chain execution of pending work items in this
+		 * case.
+		 */
+		spin_lock_irq(&pool->lock);
+		wake_up_worker(pool);
+		spin_unlock_irq(&pool->lock);
+	}
 }
 
 /*
-- 
1.7.7.6

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/