Message-ID: <4B6A2D29.3010804@kernel.org>
Date: Thu, 04 Feb 2010 11:12:57 +0900
From: Tejun Heo <tj@...nel.org>
To: Oleg Nesterov <oleg@...hat.com>
CC: Simon Kagstrom <simon.kagstrom@...insight.net>,
linux-kernel@...r.kernel.org, laijs@...fujitsu.com,
rusty@...tcorp.com.au, akpm@...ux-foundation.org, mingo@...e.hu
Subject: Re: [PATCH] core: workqueue: BUG_ON on workqueue recursion
Hello,
On 02/04/2010 04:43 AM, Oleg Nesterov wrote:
> On 02/03, Simon Kagstrom wrote:
>>
>> When the workqueue is flushed from workqueue context (recursively), the
>> system enters a strange state where things at random (dependent on the
>> global workqueue) start misbehaving. For example, for us the console and
> >> logins lock up while the web server continues running.
>>
>> Since the system becomes unstable, change this to a BUG_ON instead.
>
> I agree with this patch. We are going to deadlock anyway: if the
> condition is true, the caller is cwq->current_work, which means
> flush_cpu_workqueue() will insert the barrier and hang.
>
> However,
>
>> @@ -482,7 +482,7 @@ static int flush_cpu_workqueue(struct cpu_workqueue_struct *cwq)
> >> 	int active = 0;
> >> 	struct wq_barrier barr;
> >>
> >> -	WARN_ON(cwq->thread == current);
> >> +	BUG_ON(cwq->thread == current);
>
> Another option is to change the code to do
>
> 	if (WARN_ON(cwq->thread == current))
> 		return;
>
> This gives the kernel chance to survive after the warning.
>
> What do you think?
Yeah, I like this one better too. Even purely for debugging,
WARN_ON() is better, as users often don't have a reliable way to
gather the kernel log after a BUG_ON().
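
For illustration only, here is a rough userspace sketch of the "warn and bail out" pattern discussed above. It is not kernel code: warn_on_impl(), struct cwq_sketch, flush_sketch() and the integer thread ids are hypothetical stand-ins, and this WARN_ON() is a simplified imitation whose only shared property with the real macro is that it evaluates to the truth value of its condition.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for the kernel's WARN_ON(): print a warning
 * when the condition holds and evaluate to its truth value, so the
 * macro can be used directly inside an if ().
 */
static int warn_on_impl(int cond, const char *expr, const char *file, int line)
{
	if (cond)
		fprintf(stderr, "WARNING: at %s:%d: %s\n", file, line, expr);
	return cond;
}
#define WARN_ON(cond) warn_on_impl(!!(cond), #cond, __FILE__, __LINE__)

/* Hypothetical, heavily simplified queue: only the field the check needs. */
struct cwq_sketch {
	int thread;	/* id of the worker that services this queue */
};

/*
 * Returns 1 if the flush would proceed, 0 if it was refused because
 * the caller is the queue's own worker (the recursive case).
 */
static int flush_sketch(const struct cwq_sketch *cwq, int current_id)
{
	if (WARN_ON(cwq->thread == current_id))
		return 0;	/* warn and bail out instead of waiting on ourselves */

	/* ... a real flush would insert a barrier and wait here ... */
	return 1;
}
```

With a queue whose worker id is 7, flush_sketch(&q, 7) prints a warning and returns 0, while flush_sketch(&q, 3) returns 1; the caller keeps running in both cases, which is the point of preferring the warning over a BUG_ON().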
Thanks.
--
tejun