lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 10 Sep 2012 10:11:25 +0800
From:	Lai Jiangshan <laijs@...fujitsu.com>
To:	Tejun Heo <tj@...nel.org>, linux-kernel@...r.kernel.org
Cc:	Lai Jiangshan <laijs@...fujitsu.com>
Subject: [PATCH 2/2 V7 for-3.6-fixes] workqueue: fix idle worker depletion

If hotplug code grabbed the manager_mutex and worker_thread try to create
a worker, the manage_worker() will return false and worker_thread go to
process work items. Now, on the CPU, all workers are processing work items,
no idle_worker left/ready for managing. It breaks the concept of workqueue
and it is bug.

So when manage_worker() failed to grab the manager_mutex, it should
release gcwq->lock and then grab manager_mutex.

After gcwq->lock is released, hotplug can happen. but the hoplug code
can't unbind/rebind the manager, so the manager should try to rebind
itself unconditionaly, if it fails, unbind itself.

Signed-off-by: Lai Jiangshan <laijs@...fujitsu.com>
---
 kernel/workqueue.c |   31 ++++++++++++++++++++++++++++++-
 1 files changed, 30 insertions(+), 1 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 383548e..74434c8 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1825,10 +1825,39 @@ static bool manage_workers(struct worker *worker)
 	struct worker_pool *pool = worker->pool;
 	bool ret = false;
 
-	if (!mutex_trylock(&pool->manager_mutex))
+	if (pool->flags & POOL_MANAGING_WORKERS)
 		return ret;
 
 	pool->flags |= POOL_MANAGING_WORKERS;
+
+	if (unlikely(!mutex_trylock(&pool->manager_mutex))) {
+		/*
+		 * Ouch! rebind_workers() or gcwq_unbind_fn() beats it.
+		 * it can't return false here, otherwise it will lead to
+		 * worker depletion. So we release gcwq->lock and then
+		 * grab manager_mutex again.
+		 */
+		spin_unlock_irq(&pool->gcwq->lock);
+		mutex_lock(&pool->manager_mutex);
+
+		/*
+		 * The hotplug had happened after the previous releasing
+		 * of gcwq->lock. So we can't assume that this worker is
+		 * still associated or not. And we have to try to rebind it
+		 * via worker_maybe_bind_and_lock(). If it returns false,
+		 * we can conclude that the whole gcwq is disassociated,
+		 * and we must unbind this worker. (hotplug code can't
+		 * unbind/rebind the manager, because hotplug code can't
+		 * iterate the manager)
+		 */
+		if (worker_maybe_bind_and_lock(worker))
+			worker->flags &= ~WORKER_UNBOUND;
+		else
+			worker->flags |= WORKER_UNBOUND;
+
+		ret = true;
+	}
+
 	pool->flags &= ~POOL_MANAGE_WORKERS;
 
 	/*
-- 
1.7.4.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ