[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <200612061726.14587.bjorn.helgaas@hp.com>
Date: Wed, 6 Dec 2006 17:26:14 -0700
From: Bjorn Helgaas <bjorn.helgaas@...com>
To: Ingo Molnar <mingo@...e.hu>, linux-kernel@...r.kernel.org
Cc: Myron Stowe <myron.stowe@...com>, Jens Axboe <axboe@...nel.dk>
Subject: workqueue deadlock
I'm seeing a workqueue-related deadlock. This is on an ia64
box running SLES10, but it looks like the same problem should
be possible in current upstream on any architecture.
Here are the two tasks involved:
events/4:
schedule
__down
__lock_cpu_hotplug
lock_cpu_hotplug
flush_workqueue
kblockd_flush
blk_sync_queue
cfq_shutdown_timer_wq
cfq_exit_queue
elevator_exit
blk_cleanup_queue
scsi_free_queue
scsi_device_dev_release_usercontext
run_workqueue
loadkeys:
schedule
flush_cpu_workqueue
flush_workqueue
flush_scheduled_work
release_dev
tty_release
loadkeys is holding the cpu_hotplug lock (acquired in flush_workqueue())
and waiting in flush_cpu_workqueue() until the cpu_workqueue drains.
But events/4 is responsible for draining it, and it is blocked waiting
to acquire the cpu_hotplug lock.
In current upstream, the cpu_hotplug lock has been replaced with
workqueue_mutex, but it looks to me like the same deadlock is still
possible.
Is there some rule that workqueue functions shouldn't try to
flush a workqueue? Or should flush_workqueue() be smarter by
releasing the workqueue_mutex once in a while?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists