linux-kernel - Re: [PATCH] workqueue: not allow recursion run

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <49780C3A.1050601@cn.fujitsu.com>
Date:	Thu, 22 Jan 2009 14:03:38 +0800
From:	Lai Jiangshan <laijs@...fujitsu.com>
To:	Peter Zijlstra <peterz@...radead.org>
CC:	Oleg Nesterov <oleg@...sign.ru>, Ingo Molnar <mingo@...e.hu>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] workqueue: not allow recursion run_workqueue

Peter Zijlstra wrote:
> On Wed, 2009-01-21 at 17:42 +0800, Lai Jiangshan wrote:
>> 1) lockdep will complain when recursion run_workqueue
>> 2) works is not run orderly when recursion run_workqueue
>>
>> 3) BUG!
>>    We use recursion run_workqueue to hidden deadlock when
>>    keventd trying to flush its own queue.
>>
>>    It's bug. When flush_workqueue()(nested in a work callback)returns,
>>    the workqueue is not really flushed, the sequence statement of
>>    this work callback will do some thing bad.
>>
>>    So we should not allow workqueue trying to flush its own queue.
> 
> The patch looks good, but I'm utterly failing to comprehend this
> changelog. What exactly can go wrong (other than the obvious too deep
> nest and the fact that lockdep will complain)?

void do_some_cleanup(void)
{
	find_all_queued_work_struct_and_mark_it_old();
	flush_workqueue(workqueue);
	/* we can destroy old work_struct for we have flushed them */
	destroy_old_work_structs();
}

if work->func() called do_some_cleanup(), it's very probably a bug.

> 
>> Signed-off-by: Lai Jiangshan <laijs@...fujitsu.com>
>> ---
>> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
>> index 2f44583..1129cde 100644
>> --- a/kernel/workqueue.c
>> +++ b/kernel/workqueue.c
>> @@ -48,8 +48,6 @@ struct cpu_workqueue_struct {
>>  
>>  	struct workqueue_struct *wq;
>>  	struct task_struct *thread;
>> -
>> -	int run_depth;		/* Detect run_workqueue() recursion depth */
>>  } ____cacheline_aligned;
>>  
>>  /*
>> @@ -262,13 +260,6 @@ EXPORT_SYMBOL_GPL(queue_delayed_work_on);
>>  static void run_workqueue(struct cpu_workqueue_struct *cwq)
>>  {
>>  	spin_lock_irq(&cwq->lock);
>> -	cwq->run_depth++;
>> -	if (cwq->run_depth > 3) {
>> -		/* morton gets to eat his hat */
>> -		printk("%s: recursion depth exceeded: %d\n",
>> -			__func__, cwq->run_depth);
>> -		dump_stack();
>> -	}
>>  	while (!list_empty(&cwq->worklist)) {
>>  		struct work_struct *work = list_entry(cwq->worklist.next,
>>  						struct work_struct, entry);
>> @@ -311,7 +302,6 @@ static void run_workqueue(struct cpu_workqueue_struct *cwq)
>>  		spin_lock_irq(&cwq->lock);
>>  		cwq->current_work = NULL;
>>  	}
>> -	cwq->run_depth--;
>>  	spin_unlock_irq(&cwq->lock);
>>  }
>>  
>> @@ -368,29 +358,20 @@ static void insert_wq_barrier(struct cpu_workqueue_struct *cwq,
>>  
>>  static int flush_cpu_workqueue(struct cpu_workqueue_struct *cwq)
>>  {
>> -	int active;
>> +	int active = 0;
>> +	struct wq_barrier barr;
>>  
>> -	if (cwq->thread == current) {
>> -		/*
>> -		 * Probably keventd trying to flush its own queue. So simply run
>> -		 * it by hand rather than deadlocking.
>> -		 */
>> -		run_workqueue(cwq);
>> -		active = 1;
>> -	} else {
>> -		struct wq_barrier barr;
>> +	WARN_ON(cwq->thread == current);
>>  
>> -		active = 0;
>> -		spin_lock_irq(&cwq->lock);
>> -		if (!list_empty(&cwq->worklist) || cwq->current_work != NULL) {
>> -			insert_wq_barrier(cwq, &barr, &cwq->worklist);
>> -			active = 1;
>> -		}
>> -		spin_unlock_irq(&cwq->lock);
>> -
>> -		if (active)
>> -			wait_for_completion(&barr.done);
>> +	spin_lock_irq(&cwq->lock);
>> +	if (!list_empty(&cwq->worklist) || cwq->current_work != NULL) {
>> +		insert_wq_barrier(cwq, &barr, &cwq->worklist);
>> +		active = 1;
>>  	}
>> +	spin_unlock_irq(&cwq->lock);
>> +
>> +	if (active)
>> +		wait_for_completion(&barr.done);
>>  
>>  	return active;
>>  }
>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majordomo@...r.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/
> 
> 
> 
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/