Message-ID: <EB12A50964762B4D8111D55B764A845401169937@scsmsx413.amr.corp.intel.com>
Date: Mon, 8 Jan 2007 10:37:25 -0800
From: "Pallipadi, Venkatesh" <venkatesh.pallipadi@...el.com>
To: "Oleg Nesterov" <oleg@...sign.ru>,
"Srivatsa Vaddagiri" <vatsa@...ibm.com>
Cc: "Andrew Morton" <akpm@...l.org>,
"David Howells" <dhowells@...hat.com>,
"Christoph Hellwig" <hch@...radead.org>,
"Ingo Molnar" <mingo@...e.hu>,
"Linus Torvalds" <torvalds@...l.org>,
<linux-kernel@...r.kernel.org>, "Gautham shenoy" <ego@...ibm.com>
Subject: RE: [PATCH] fix-flush_workqueue-vs-cpu_dead-race-update
>-----Original Message-----
>From: linux-kernel-owner@...r.kernel.org
>[mailto:linux-kernel-owner@...r.kernel.org] On Behalf Of Oleg Nesterov
>Sent: Monday, January 08, 2007 9:07 AM
>To: Srivatsa Vaddagiri
>Cc: Andrew Morton; David Howells; Christoph Hellwig; Ingo
>Molnar; Linus Torvalds; linux-kernel@...r.kernel.org; Gautham shenoy
>Subject: Re: [PATCH] fix-flush_workqueue-vs-cpu_dead-race-update
>
>On 01/08, Srivatsa Vaddagiri wrote:
>>
>> On Mon, Jan 08, 2007 at 06:56:38PM +0300, Oleg Nesterov wrote:
>> > > 2.
>> > >
>> > > CPU_DEAD->cleanup_workqueue_thread->(cwq->thread = NULL)->kthread_stop() ..
>> > >                                     ^^^^^^^^^^^^^^^^^^^^
>> > >                                     |___ Problematic
>> >
>> > Hmm... This should not be possible? cwq->thread != NULL on CPU_DEAD event.
>>
>> sure, cwq->thread != NULL at CPU_DEAD event. However
>> cleanup_workqueue_thread() will set it to NULL and block in
>> kthread_stop(), waiting for the kthread to finish run_workqueue and
>> exit.
>
>Ah, missed you point, thanks. Yet another old problem which was not
>introduced by recent changes. And yet another indication we should avoid
>kthread_stop() on CPU_DEAD event :) I believe this is easy to fix, but
>need to think more.

The current code in the workqueue-hotplug path is full of races. I stumbled
upon at least a couple of different deadlock situations being discussed
here, with the ondemand governor using a workqueue and trying to flush it
during cpu hot remove.

Specifically, I hit a three-way deadlock: kthread_stop() is called with
workqueue_mutex held, while the work itself is blocked on some other mutex
held by another task that is trying to flush the workqueue.
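
To make the cycle concrete, here is a small userspace model of the wait-for
chain described above. It is only a sketch of my reading of the scenario,
not kernel code: the task labels, the waits_for[] array and in_wait_cycle()
are made up for illustration, and it assumes the flushing task is stuck
because it needs workqueue_mutex.

```c
#include <assert.h>

/* Hypothetical labels for the three parties in the deadlock. */
enum task { HOTPLUG, WORKER, FLUSHER, NTASKS };

/*
 * waits_for[t] is the task that t is blocked on, or -1 if t can run.
 * Fill in the scenario from the mail (edges are an interpretation).
 */
static void described_scenario(int waits_for[NTASKS])
{
	/* Holds workqueue_mutex, blocked in kthread_stop() on the worker. */
	waits_for[HOTPLUG] = WORKER;
	/* Work item is blocked on a mutex owned by the flushing task. */
	waits_for[WORKER] = FLUSHER;
	/* Assumed: the flush cannot proceed without workqueue_mutex. */
	waits_for[FLUSHER] = HOTPLUG;
}

/* Return 1 if following the wait-for chain from 'start' leads back to it. */
static int in_wait_cycle(const int waits_for[NTASKS], int start)
{
	int t = start;
	int steps;

	for (steps = 0; steps < NTASKS; steps++) {
		t = waits_for[t];
		if (t < 0)
			return 0;	/* chain ends at a runnable task */
		assert(t < NTASKS);
		if (t == start)
			return 1;	/* back where we started: deadlock */
	}
	return 0;
}
```

In this model every task sits on the cycle, and clearing any single edge
(for example, the hotplug task not waiting on the worker) makes the chain
terminate.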

One other approach I was thinking about was to do all the hard work in the
workqueue CPU_DOWN_PREPARE callback rather than in CPU_DEAD.
We can call cleanup_workqueue_thread and take_over_work in DOWN_PREPARE.
With that, I don't think we need to hold the workqueue_mutex across
these two callbacks, which would eliminate the deadlocks related to
flush_workqueue.
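
Roughly, the shape of that idea can be mocked up in userspace as below. This
is a toy sketch, not a patch: the toy_* names, the event enum and the
per-CPU struct are hypothetical stand-ins, and the real
cleanup_workqueue_thread/take_over_work have different signatures and do
much more.

```c
#include <assert.h>

#define MAX_CPUS 4

/* Hypothetical event codes standing in for the hotplug notifier actions. */
enum hp_event { EV_DOWN_PREPARE, EV_DEAD };

/* Toy per-CPU workqueue state (not the kernel's structure). */
struct toy_cwq {
	int has_thread;	/* 1 while a worker thread exists for this CPU */
	int pending;	/* number of queued work items */
};

static struct toy_cwq cwqs[MAX_CPUS];

/* Stand-in for stopping the per-CPU worker thread. */
static void toy_cleanup_thread(int cpu)
{
	cwqs[cpu].has_thread = 0;
}

/* Stand-in for moving queued work from the dying CPU to another one. */
static void toy_take_over_work(int from, int to)
{
	cwqs[to].pending += cwqs[from].pending;
	cwqs[from].pending = 0;
}

/* All the teardown happens at DOWN_PREPARE; DEAD has nothing left to do. */
static void toy_hotplug_callback(enum hp_event ev, int cpu, int target)
{
	assert(cpu >= 0 && cpu < MAX_CPUS);
	assert(target >= 0 && target < MAX_CPUS && target != cpu);

	switch (ev) {
	case EV_DOWN_PREPARE:
		toy_cleanup_thread(cpu);
		toy_take_over_work(cpu, target);
		break;
	case EV_DEAD:
		break;
	}
}
```

The point of the sketch is only that nothing has to stay locked between the
two events, since the second one no longer touches the per-CPU state.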

Do you think this approach would simplify things around here?

Thanks,
Venki