Message-ID: <CALZhoSRyjRYVcKpq9cECHWgKkci6V+O_zn2CkiwqnFo57tZDfQ@mail.gmail.com>
Date: Wed, 6 Mar 2013 22:39:15 +0800
From: Lei Wen <adrian.wenl@...il.com>
To: Tejun Heo <tj@...nel.org>
Cc: linux-kernel@...r.kernel.org, leiwen@...vell.com
Subject: Re: workqueue panic in 3.4 kernel
Hi Tejun,
On Wed, Mar 6, 2013 at 12:32 AM, Tejun Heo <tj@...nel.org> wrote:
> Hello,
>
> On Tue, Mar 05, 2013 at 03:31:45PM +0800, Lei Wen wrote:
>> By checking memory, we find that work->data becomes 0x300 when it tries
>> to call get_work_cwq
>
> Why would that become 0x300? Who's writing to that memory? Nobody
> should be.
We find a race condition as below:
CPU0                                            CPU1
timer interrupt happens
__run_timers
__run_timers::spin_lock_irq(&base->lock)
__run_timers::spin_unlock_irq(&base->lock)
                                                __cancel_work_timer
                                                __cancel_work_timer::del_timer
                                                __cancel_work_timer::wait_on_work
                                                __cancel_work_timer::clear_work_data
__run_timers::call_timer_fn(timer, fn, data)
    delayed_work_timer_fn::get_work_cwq
__run_timers::spin_lock_irq(&base->lock)
It is possible for __cancel_work_timer to run on cpu1 __BEFORE__ cpu0 is
ready to run the timer callback, which is delayed_work_timer_fn in our case.
Although __cancel_work_timer calls wait_on_work to wait for already-queued
work to complete, here the work has not been queued yet because its callback
has not run, so wait_on_work returns immediately and clear_work_data clears
work->data.
Thus when delayed_work_timer_fn is finally called, it sees work->data as 0x300.
Do you think this sequence is possible?
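For reference, this is roughly what the cpu0 side looks like in our 3.4 tree
(paraphrased from kernel/workqueue.c from memory, so please check it against
the actual source):

void delayed_work_timer_fn(unsigned long __data)
{
        struct delayed_work *dwork = (struct delayed_work *)__data;
        /*
         * If cpu1's __cancel_work_timer() has already run clear_work_data()
         * in the window after __run_timers() dropped base->lock, the
         * WORK_STRUCT_CWQ bit in work->data is gone and get_work_cwq()
         * returns NULL ...
         */
        struct cpu_workqueue_struct *cwq = get_work_cwq(&dwork->work);

        /* ... so this cwq->wq dereference is the NULL pointer panic we hit. */
        __queue_work(smp_processor_id(), cwq->wq, &dwork->work);
}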
>
>> in delayed_work_timer_fn. Thus cwq becomes NULL before __queue_work is
>> called, so it is reasonable that the kernel panics when it tries to access
>> wq through cwq->wq.
>>
>> To fix it, we try to backport below patches:
>> commit 60c057bca22285efefbba033624763a778f243bf
>> Author: Lai Jiangshan <laijs@...fujitsu.com>
>> Date: Wed Feb 6 18:04:53 2013 -0800
>>
>> workqueue: add delayed_work->wq to simplify reentrancy handling
>>
>> commit 1265057fa02c7bed3b6d9ddc8a2048065a370364
>> Author: Tejun Heo <tj@...nel.org>
>> Date: Wed Aug 8 09:38:42 2012 -0700
>>
>> workqueue: fix CPU binding of flush_delayed_work[_sync]()
>
> Neither should affect the problem you described above. It *could*
> make the problem go away just because it would stop using wq->data to
> record cwq if the corruption was contained to that field but that
> isn't a proper fix and the underlying problem could easily cause other
> issues.
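Right, and to spell out why the corruption shows up as a NULL cwq: as I read
3.4's kernel/workqueue.c (again paraphrased from memory, please correct me if
my tree differs), clear_work_data leaves only the off-queue CPU bits in
work->data, so get_work_cwq no longer sees WORK_STRUCT_CWQ and returns NULL:

static void clear_work_data(struct work_struct *work)
{
        /*
         * work->data becomes WORK_STRUCT_NO_CPU, i.e. WORK_CPU_NONE shifted
         * past the flag/color bits; on a config with NR_CPUS=2 and no
         * DEBUG_OBJECTS_WORK that works out to 3 << 8 = 0x300, which would
         * explain the value we observed.
         */
        set_work_data(work, WORK_STRUCT_NO_CPU, 0);
}

static struct cpu_workqueue_struct *get_work_cwq(struct work_struct *work)
{
        unsigned long data = atomic_long_read(&work->data);

        if (data & WORK_STRUCT_CWQ)     /* cleared by clear_work_data() */
                return (void *)(data & WORK_STRUCT_WQ_DATA_MASK);
        else
                return NULL;            /* what delayed_work_timer_fn() gets */
}

So once clear_work_data has run, the cwq recorded in work->data is lost and
the timer callback has nothing valid to queue against.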
>
>> And we add the change below to make sure __cancel_work_timer cannot slip
>> in between run_timer_softirq and delayed_work_timer_fn.
>> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
>> index bf4888c..0e9f77c 100644
>> --- a/kernel/workqueue.c
>> +++ b/kernel/workqueue.c
>> @@ -2627,7 +2627,7 @@ static bool __cancel_work_timer(struct work_struct *work,
>> ret = (timer && likely(del_timer(timer)));
>> if (!ret)
>> ret = try_to_grab_pending(work);
>> - wait_on_work(work);
>> + flush_work(work);
>> } while (unlikely(ret < 0));
>>
>> clear_work_data(work);
>>
>> Do you think this fix is enough? And is calling flush_work directly in
>> __cancel_work_timer OK as the fix?
>
> Maybe I'm missing something but it looks like the root cause hasn't
> been diagnosed and you're piling up band-aids in workqueue code. You
> might get away with it but could also be making the problem just more
> obscure and difficult to debug (reproducible bugs are happy bugs).
>
> I'd suggest finding out who owns the delayed_work item and examining
> why the delayed_work is getting corrupted in the first place.
>
> Thanks.
>
> --
> tejun
Thanks,
Lei
--