Message-ID: <EB12A50964762B4D8111D55B764A845401169937@scsmsx413.amr.corp.intel.com>
Date:	Mon, 8 Jan 2007 10:37:25 -0800
From:	"Pallipadi, Venkatesh" <venkatesh.pallipadi@...el.com>
To:	"Oleg Nesterov" <oleg@...sign.ru>,
	"Srivatsa Vaddagiri" <vatsa@...ibm.com>
Cc:	"Andrew Morton" <akpm@...l.org>,
	"David Howells" <dhowells@...hat.com>,
	"Christoph Hellwig" <hch@...radead.org>,
	"Ingo Molnar" <mingo@...e.hu>,
	"Linus Torvalds" <torvalds@...l.org>,
	<linux-kernel@...r.kernel.org>, "Gautham shenoy" <ego@...ibm.com>
Subject: RE: [PATCH] fix-flush_workqueue-vs-cpu_dead-race-update

 

>-----Original Message-----
>From: linux-kernel-owner@...r.kernel.org 
>[mailto:linux-kernel-owner@...r.kernel.org] On Behalf Of Oleg Nesterov
>Sent: Monday, January 08, 2007 9:07 AM
>To: Srivatsa Vaddagiri
>Cc: Andrew Morton; David Howells; Christoph Hellwig; Ingo 
>Molnar; Linus Torvalds; linux-kernel@...r.kernel.org; Gautham shenoy
>Subject: Re: [PATCH] fix-flush_workqueue-vs-cpu_dead-race-update
>
>On 01/08, Srivatsa Vaddagiri wrote:
>>
>> On Mon, Jan 08, 2007 at 06:56:38PM +0300, Oleg Nesterov wrote:
>> > > 2.
>> > >
>> > > CPU_DEAD->cleanup_workqueue_thread->(cwq->thread = NULL)->kthread_stop() ..
>> > > 				    ^^^^^^^^^^^^^^^^^^^^
>> > > 						|___ Problematic
>> > 
>> > Hmm... This should not be possible? cwq->thread != NULL on CPU_DEAD event.
>> 
>> sure, cwq->thread != NULL at CPU_DEAD event. However
>> cleanup_workqueue_thread() will set it to NULL and block in
>> kthread_stop(), waiting for the kthread to finish run_workqueue and
>> exit.
>
>Ah, missed you point, thanks. Yet another old problem which was not introduced
>by recent changes. And yet another indication we should avoid kthread_stop()
>on CPU_DEAD event :) I believe this is easy to fix, but need to think more.

The current code in the workqueue-hotplug path is full of races. I
stumbled upon at least a couple of the different deadlock situations
being discussed here, with the ondemand governor using a workqueue and
trying to flush it during CPU hot remove.

Specifically, I saw a three-way deadlock: kthread_stop() is called with
workqueue_mutex held, while the work item itself is blocked on some
other mutex held by another task that is trying to flush the workqueue.
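
[Editorial sketch, not part of the original message.] The three-way
deadlock described above can be pictured as a cycle in a wait-for graph.
The following userspace C model is only an illustration: the task labels,
the `waits_for` array and `has_deadlock()` are hypothetical names, and
the edge from the flusher back to the hotplug path assumes that flushing
needs workqueue_mutex, which is one reading of the description above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the three parties in the deadlock. */
enum task { HOTPLUG, WORKER, FLUSHER, NTASKS };

/*
 * waits_for[t] is the task that t is blocked on, or -1 if t can run.
 * Returns true if following the chain from 'start' leads back to it.
 */
bool has_deadlock(const int waits_for[NTASKS], enum task start)
{
    int cur = (int)start;

    for (size_t hops = 0; hops < NTASKS; hops++) {
        cur = waits_for[cur];
        if (cur < 0)
            return false;        /* chain ends at a runnable task */
        if (cur == (int)start)
            return true;         /* cycle: nobody can make progress */
    }
    return false;
}

/* The situation from the text, expressed as wait-for edges. */
const int three_way[NTASKS] = {
    [HOTPLUG] = WORKER,  /* kthread_stop() while holding workqueue_mutex */
    [WORKER]  = FLUSHER, /* work item blocked on a mutex the flusher holds */
    [FLUSHER] = HOTPLUG, /* flush waits for workqueue_mutex (assumption) */
};
```

Calling `has_deadlock(three_way, HOTPLUG)` reports the cycle; removing
any one edge (for example, not holding workqueue_mutex across
kthread_stop()) breaks it.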

One other approach I was thinking about was to do all the hard work in
the workqueue CPU_DOWN_PREPARE callback rather than in CPU_DEAD.
We can call cleanup_workqueue_thread and take_over_work in DOWN_PREPARE.
With that, I don't think we need to hold the workqueue_mutex across
these two callbacks, which would eliminate the deadlocks related to
flush_workqueue.
Do you think this approach would simplify things around here?
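
[Editorial sketch, not part of the original message.] A rough userspace
model of the proposal might look like the code below. It is not kernel
code: `struct cwq_model`, `proposed_callback()`, the `HP_*` event names
and the `mutex_held` flag are all hypothetical stand-ins, and the
DOWN_FAILED handling is an assumption that the message does not discuss.
The point it tries to show is that the cleanup and the work takeover
both finish inside DOWN_PREPARE, so no lock stays held until CPU_DEAD.

```c
/* Stand-ins for the kernel's hotplug event codes; values are arbitrary. */
enum hp_event { HP_DOWN_PREPARE, HP_DOWN_FAILED, HP_DEAD };

struct cwq_model {
    int has_thread;  /* models cwq->thread != NULL */
    int pending;     /* work items still queued on this CPU */
};

static int mutex_held;  /* models workqueue_mutex being held */

/* Models cleanup_workqueue_thread(): stop the per-CPU worker. */
static void cleanup_thread(struct cwq_model *cwq)
{
    cwq->has_thread = 0;
}

/* Models take_over_work(): move pending items to a surviving CPU. */
static void take_over(struct cwq_model *dying, struct cwq_model *target)
{
    target->pending += dying->pending;
    dying->pending = 0;
}

/* Returns 1 if the modelled mutex is still held on return, else 0. */
int proposed_callback(enum hp_event ev, struct cwq_model *dying,
                      struct cwq_model *target)
{
    switch (ev) {
    case HP_DOWN_PREPARE:
        mutex_held = 1;            /* held only within this callback */
        cleanup_thread(dying);
        take_over(dying, target);
        mutex_held = 0;            /* dropped before returning */
        break;
    case HP_DOWN_FAILED:
        dying->has_thread = 1;     /* assumption: worker is recreated */
        break;
    case HP_DEAD:
        break;                     /* nothing left to do here */
    }
    return mutex_held;
}
```

In this model a flusher that runs between DOWN_PREPARE and CPU_DEAD
never finds the mutex held by the hotplug path, which is the property
the proposal is after.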

Thanks,
Venki 