Message-ID: <20090128233112.GB7631@nowhere>
Date: Thu, 29 Jan 2009 00:31:14 +0100
From: Frederic Weisbecker <fweisbec@...il.com>
To: Oleg Nesterov <oleg@...hat.com>
Cc: Ingo Molnar <mingo@...e.hu>, linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
Lai Jiangshan <laijs@...fujitsu.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Steven Rostedt <rostedt@...dmis.org>,
Alasdair G Kergon <agk@...hat.com>,
Arjan van de Ven <arjan@...radead.org>
Subject: Re: [RFC v2][PATCH] create workqueue threads only when needed
On Wed, Jan 28, 2009 at 04:02:24AM +0100, Oleg Nesterov wrote:
> On 01/28, Frederic Weisbecker wrote:
> >
> > +static void workqueue_unshadow(struct cpu_workqueue_struct *cwq)
> > +{
> > + struct workqueue_shadow *ws;
> > +
> > + /* Prevent from concurrent unshadowing */
> > + if (unlikely(atomic_inc_return(&cwq->unshadowed) != 1))
> > + goto already_unshadowed;
> > +
> > + /*
> > +	 * The work can be inserted from any context.
> > + * But such atomic allocation will be rare and freed soon.
> > + */
> > + ws = kmalloc(sizeof(*ws), GFP_ATOMIC);
> > + if (!ws) {
> > + WARN_ON_ONCE(1);
> > + goto already_unshadowed;
> > + }
> > + INIT_DELAYED_WORK(&ws->work, workqueue_unshadow_work);
> > + ws->cwq = cwq;
> > + schedule_delayed_work(&ws->work, 0);
> > +
> > + return;
> > +
> > +already_unshadowed:
> > + atomic_dec(&cwq->unshadowed);
> > +}
>
> Can't understand why you use a delayed work...
>
> I must admit, I don't like this patch. Perhaps I am wrong, mostly I
> dislike the complications it adds.
>
> Anybody else please vote for this change?
>
> Hmm. We never reset cwq->unshadowed, so cwq->thread becomes "non-lazy"
> after cpu_down() + cpu_up().
>
> And. Of course it is not good that queue_work() can silently fail just
> because a GFP_ATOMIC allocation fails. This is not acceptable, imho. But fixable.
>
> What is not fixable is that this patch adds a subtle lock-ordering
> problem. With this patch any flush_work() or flush_workqueue() or
> destroy_workqueue() depend on keventd, and can deadlock if the caller
> shares the lock with any work_struct on keventd.
>
> Or. let's suppose keventd has a sleeping work_struct which waits
> for the event. Now we queue the work which should "implement"
> this event on !unshadowed wq - deadlock.
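>
> IOW, a contrived example (made-up lock and work, not from the patch):
>
>	static DEFINE_MUTEX(my_mutex);
>
>	static void my_work_fn(struct work_struct *unused)
>	{
>		/* pending on keventd */
>		mutex_lock(&my_mutex);
>		mutex_unlock(&my_mutex);
>	}
>
>	mutex_lock(&my_mutex);
>	/* waits for keventd to run workqueue_unshadow_work() */
>	flush_workqueue(my_lazy_wq);
>	mutex_unlock(&my_mutex);
>
> If my_work_fn() sits on keventd's queue before workqueue_unshadow_work(),
> keventd blocks on my_mutex, the thread is never created, and the flusher
> never releases the lock.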
>
> Another problem. workqueue_unshadow_work() populates cwq->thread and
> binds it to smp_processor_id(). This is not safe, CPU can go away
> after smp_processor_id() but before wake_up_process().
>
> Oh, and schedule_delayed_work() is not right, think about queue_work_on():
> the unshadow work runs on the local keventd, not necessarily on the CPU
> the work was queued for.
Hi Oleg,
I was about to retry the same approach through async functions (kernel/async.c),
which would have solved the possible deadlock you described and would have made
the synchronization easier.

But actually this is only half a solution: pointless workqueues stay pointless,
even if their threads don't show up before the first work runs. And moreover it
hides the real problem: parts of the kernel use dedicated workqueues where
keventd would be sufficient most of the time. And where keventd can't be used
because of deadlocks or slow works, async functions are a good solution.

I think (more precisely, I agree with you) these callsites should be fixed
individually, as in the sketch below. I shall fix some of them, as far as the
limited hardware I can test on allows.
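
For instance, a driver that keeps a private workqueue only to run one slow
init routine could roughly become something like this (untested sketch,
made-up names, just to show the direction):

	#include <linux/async.h>

	static void my_dev_init_async(void *data, async_cookie_t cookie)
	{
		struct my_dev *mydev = data;

		my_dev_slow_init(mydev);	/* may sleep for seconds */
	}

	/* instead of queue_work(mydev->wq, &mydev->init_work): */
	async_schedule(my_dev_init_async, mydev);

	/* and instead of flush_workqueue(mydev->wq) on unload: */
	async_synchronize_full();

The dedicated thread disappears completely instead of just being hidden
until the first work.
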
Thanks.
> > static void cleanup_workqueue_thread(struct cpu_workqueue_struct *cwq)
> > {
> > + DEFINE_WAIT(wait);
> > + long timeout = 0;
> > + int unshadowed = atomic_read(&cwq->unshadowed);
> > +
> > + /* Shadowed => no thread has been created */
> > + if (!unshadowed)
> > + return;
>
> This is not right: if the previous workqueue_unshadow() failed, we
> can return with works still pending.
>
> > +
> > /*
> > - * Our caller is either destroy_workqueue() or CPU_POST_DEAD,
> > - * cpu_add_remove_lock protects cwq->thread.
> > + * If it's unshadowed, we want to ensure the thread creation
> > + * has been completed.
> > */
> > - if (cwq->thread == NULL)
> > - return;
> > + prepare_to_wait(&cwq->thread_creation, &wait, TASK_UNINTERRUPTIBLE);
> > + if (!cwq->thread)
> > + timeout = schedule_timeout_interruptible(HZ * 3);
> > + finish_wait(&cwq->thread_creation, &wait);
> > +
> > + /* We waited for 3 seconds, this is likely a soft lockup */
> > + WARN_ON(timeout);
>
> Can't understand... schedule_timeout() returns the remaining jiffies,
> so if timeout != 0, we were woken by workqueue_unshadow_work() before
> the timeout expired - and then the WARN_ON() fires in the normal case?
>
> Anyway. We should not proceed if we failed to create cwq->thread.
> The kernel can crash. And of course this is not good either. Yes,
> you modified flush_cpu_workqueue() to call workqueue_unshadow(),
> but this can fail too. And if another thread cancels the pending
> works, flush_cpu_workqueue() just returns, and we crash. Or we
> can hang forever.
>
> Also. Please note that cleanup_workqueue_thread() can also be
> called by CPU_UP_CANCELED when cwq->thread == NULL because it
> was never created. We should do nothing in this case, but we
> will hang if cwq->unshadowed != 0.
>
> > switch (action) {
> > case CPU_UP_PREPARE:
> > + /* Will be created during the first work insertion */
> > + if (!atomic_read(&cwq->unshadowed))
> > + break;
> > if (!create_workqueue_thread(cwq, cpu))
> > break;
> > printk(KERN_ERR "workqueue [%s] for %i failed\n",
> > @@ -964,6 +1086,8 @@ undo:
> > goto undo;
> >
> > case CPU_ONLINE:
> > + if (!atomic_read(&cwq->unshadowed))
> > + break;
> > start_workqueue_thread(cwq, cpu);
> > break;
>
> Suppose that we have some strange cpu_callback(action, cpu)
> which does:
>
> case CPU_UP_PREPARE:
> queue_work_on(cpu, my_wq, percpu_work);
> break;
> case CPU_UP_CANCELED:
> cancel_work_sync(percpu_work);
>
> Currently this works. But with this patch, the queue_work_on() above
> can leak a workqueue_shadow.
>
> Oleg.
>