linux-kernel - Re: [PATCH] flush_cpu_workqueue: don't flush an empty ->worklist

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20070116132725.GA81@tv-sign.ru>
Date:	Tue, 16 Jan 2007 16:27:25 +0300
From:	Oleg Nesterov <oleg@...sign.ru>
To:	Srivatsa Vaddagiri <vatsa@...ibm.com>
Cc:	Andrew Morton <akpm@...l.org>, David Howells <dhowells@...hat.com>,
	Christoph Hellwig <hch@...radead.org>,
	Ingo Molnar <mingo@...e.hu>,
	Linus Torvalds <torvalds@...l.org>,
	linux-kernel@...r.kernel.org, Gautham shenoy <ego@...ibm.com>,
	"Pallipadi, Venkatesh" <venkatesh.pallipadi@...el.com>
Subject: Re: [PATCH] flush_cpu_workqueue: don't flush an empty ->worklist

On 01/16, Srivatsa Vaddagiri wrote:
>
> On Mon, Jan 15, 2007 at 07:55:16PM +0300, Oleg Nesterov wrote:
> >
> > As I said already __create_workqueue() needs a fix, schedule_on_each_cpu()
> > is already broken, and should be fixed as well.
> 
> __create_workqueue() creates worker threads for all online CPUs
> currently. Accessing the online_map could be racy unless we 
> serialize the access with hotplug event (thr' a mutex like workqueue
> mutex held between LOCK_ACQ/LOCK_RELEASE messages or process freezer)
> OR take special measures as was done in flush_workqueue. How were
> you considering to deal with that raciness?

This is easy to fix in multiple ways. For example (this is not the best
approach), we can make cpu_creation_map. Like cpu_populate_map, but CPU
is cleared on CPU_DEAD/CPU_UP_CANCELED.

I'll continue when I have a time.

> > > What abt stopping that thread in CPU_DOWN_PREPARE (before freezing
> > > processes)? I understand that it may add to the latency, but compared to
> > > the overall latency of process freezer, I suspect it may not be much.
> 
> I meant issuing kthread_stop() in DOWN_PREPARE so that worker
> thread exits itself (much before CPU is actually brought down).

Deadlock if work_struct re-queues itself.

> Even if there are problems with it, how abt something like below:
> 
> workqueue_cpu_callback()
> {
> 
> 	CPU_DEAD:
> 		/* threads are still frozen at this point */
> 		take_over_work();

No, we can't move a currently executing work. This will break flush_workxxx().

> 		kthread_mark_stop(worker_thread);
> 		break;
> 
> 	CPU_CLEAN_THREADS:
> 		/* all threads resumed by now */
> 		kthread_stop(worker_thread); /* task_struct ref required? */

yes, required,

> kthread_mark_stop() will mark somewhere in task_struct that the thread
> should exit when it comes out of refrigerator.
> 
> worker_thread()
> {
> 	
>         while (!kthread_should_stop()) {
>                 if (cwq->freezeable)
>                         try_to_freeze();
> 
> 		if (kthread_marked_stop(current))
> 			break;

Racy. Because of kthread_stop() above we should clear cwq->thread somehow.
But we can't do this: this workqueue may be already destroyed.

> The advantage I see above is that, when take_over_work() is running, we wont 
> race with functions like flush_workqueue() (threads still frozen at that
> point) and hence we avoid hacks like migrate_sequence.

See above.

Please note that the code I posted does something like kthread_mark_stop(), but
it operates on cwq, not on task_struct, this makes a big difference. And it doesn't
need take_over_work() at all. And it doesn't need additional complications. Instead,
it lessens both the source and compiled code.

Note that I am not against using freezer for cpu-hotplug. My point is that this
can't help much for the current workqueue.c, and it will completely transparent
with the change I suggest.

Oleg.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/