Message-ID: <CA+icZUVRnwFCC446xKy_dfZbnSftk7-e8ZVmT=_9zSvTj0Gzyw@mail.gmail.com>
Date: Mon, 14 Sep 2015 17:25:33 +0200
From: Sedat Dilek <sedat.dilek@...il.com>
To: Lai Jiangshan <jiangshanlai@...il.com>
Cc: Tejun Heo <tj@...nel.org>, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [Linux v4.2] workqueue: llvmlinux: acpid: BUG: sleeping function called from invalid context at kernel/workqueue.c:2680

On Mon, Sep 14, 2015 at 4:00 PM, Sedat Dilek <sedat.dilek@...il.com> wrote:
> On Thu, Sep 10, 2015 at 3:04 AM, Lai Jiangshan <jiangshanlai@...il.com> wrote:
>> Hi, TJ
>>
>> I think we need to add might_sleep() at the top of __cancel_work_timer().
>> The might_sleep() in start_flush_work() doesn't cover all the
>> paths of __cancel_work_timer().
>> The extra check can also help to narrow down where this bug occurs.
>>
>> Hi Sedat Dilek
>>
>> [ 24.705704] irq event stamp: 19968
>> [ 24.705706] hardirqs last enabled at (19967): [<ffffffff81917ff2>] _raw_spin_unlock_irq+0x32/0x60
>> [ 24.705713] hardirqs last disabled at (19968): [<ffffffff81120477>] del_timer_sync+0x37/0x110
>>
>> Does this mean the irqs-disabled state is leaked by del_timer_sync()? That
>> should be impossible.
>>
>> usbhid_close()
>>   mutex_lock();        // has a might_sleep() check, so the problem seems to be
>>                        // hidden in one of the following statements
>>   del_timer_sync();
>>   cancel_work_sync();
>>
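The suggestion quoted above can be illustrated with a toy userspace model. This is only a sketch, not kernel code and not the attached patch: every `model_*` name is hypothetical, and the unchecked path (one that never reaches the flush helper) is an assumption made for the illustration, since the thread does not say which path of the real __cancel_work_timer() is uncovered.

```c
/* Userspace model only. All model_* names are hypothetical stand-ins
 * for the kernel functions discussed in this thread. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static bool model_irqs_disabled;  /* stands in for irqs_disabled() */
static int model_warnings;        /* number of reports emitted so far */

/* Stand-in for might_sleep(): report when sleeping is not allowed. */
#define MODEL_MIGHT_SLEEP() model_might_sleep(__FILE__, __LINE__)

static void model_might_sleep(const char *file, int line)
{
    if (model_irqs_disabled) {
        model_warnings++;
        fprintf(stderr, "BUG: sleeping function called from invalid "
                        "context at %s:%d\n", file, line);
    }
}

/* Stand-in for the flush helper, which carries its own check. */
static void model_flush_work(void)
{
    MODEL_MIGHT_SLEEP();
}

/* Stand-in for the cancel function.
 * work_was_running: take the path that calls the flush helper.
 * check_at_top:     model the proposed check at function entry. */
static void model_cancel_work_timer(bool work_was_running, bool check_at_top)
{
    if (check_at_top)
        MODEL_MIGHT_SLEEP();      /* fires on every path */

    if (work_was_running)
        model_flush_work();       /* fires only on this path */
    /* Assumed idle path: nothing else here checks the context. */
}

/* Call the model once with IRQs "disabled" and return the report count. */
int model_warnings_for(bool work_was_running, bool check_at_top)
{
    assert(!model_irqs_disabled); /* the model starts with IRQs enabled */
    model_warnings = 0;
    model_irqs_disabled = true;
    model_cancel_work_timer(work_was_running, check_at_top);
    model_irqs_disabled = false;
    return model_warnings;
}
```

In this model, `model_warnings_for(false, false)` returns 0, so the bad context goes unnoticed on the idle path, while `model_warnings_for(false, true)` returns 1. A report from the entry check also says the caller was already in the wrong context when it entered the function, which is the sense in which the check narrows down the bug.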
>
> With the attached suggested patch applied, the BUG report now looks like this:
>
> [ 22.604524] BUG: sleeping function called from invalid context at kernel/workqueue.c:2771
> [ 22.604539] in_atomic(): 0, irqs_disabled(): 1, pid: 1347, name: acpid
> [ 22.604546] 3 locks held by acpid/1347:
> [ 22.604547] #0: (&evdev->mutex){+.+...}, at: [<ffffffff8173bf0c>] evdev_release+0xbc/0xf0
> [ 22.604557] #1: (&dev->mutex#2){+.+...}, at: [<ffffffff81733677>] input_close_device+0x27/0x70
> [ 22.604566] #2: (hid_open_mut){+.+...}, at: [<ffffffffa0056378>] usbhid_close+0x28/0xb0 [usbhid]
> [ 22.604573] irq event stamp: 19874
> [ 22.604575] hardirqs last enabled at (19873): [<ffffffff81918562>] _raw_spin_unlock_irq+0x32/0x60
> [ 22.604579] hardirqs last disabled at (19874): [<ffffffff81120487>] del_timer_sync+0x37/0x110
> [ 22.604584] softirqs last enabled at (18852): [<ffffffff8189ed39>] local_bh_enable+0x9/0x20
> [ 22.604588] softirqs last disabled at (18850): [<ffffffff8189ed19>] local_bh_disable+0x9/0x20
> [ 22.604593] CPU: 3 PID: 1347 Comm: acpid Not tainted 4.2.0-6-llvmlinux-amd64 #1
> [ 22.604595] Hardware name: SAMSUNG ELECTRONICS CO., LTD. 530U3BI/530U4BI/530U4BH/530U3BI/530U4BI/530U4BH, BIOS 13XK 03/28/2013
> [ 22.604597] ffff8800d31ce948 0000000000000092 0000000000000000 ffff880116e2fbb8
> [ 22.604601] ffffffff8149289d ffff880116e2fbe8 ffffffff810cbf8a ffffffff81c51a34
> [ 22.604605] ffff8800c65d5140 0000000000000000 0000000000000ad3 ffff880116e2fc28
> [ 22.604608] Call Trace:
> [ 22.604612] [<ffffffff8149289d>] dump_stack+0x7d/0xa0
> [ 22.604617] [<ffffffff810cbf8a>] ___might_sleep+0x28a/0x2a0
> [ 22.604620] [<ffffffff810cbc8f>] __might_sleep+0x4f/0xc0
> [ 22.604623] [<ffffffff810aebae>] __cancel_work_timer+0x2e/0x270
> [ 22.604626] [<ffffffff81918502>] ? _raw_spin_unlock_irqrestore+0x52/0x80
> [ 22.604630] [<ffffffff8112043d>] ? try_to_del_timer_sync+0xad/0xc0
> [ 22.604632] [<ffffffff810aeb78>] cancel_work_sync+0x18/0x20
> [ 22.604636] [<ffffffffa00563c5>] usbhid_close+0x75/0xb0 [usbhid]
> [ 22.604641] [<ffffffffa0039431>] hidinput_close+0x31/0x40 [hid]
> [ 22.604645] [<ffffffffa0039400>] ? hidinput_open+0x40/0x40 [hid]
> [ 22.604648] [<ffffffff81733698>] input_close_device+0x48/0x70
> [ 22.604651] [<ffffffff8173bf26>] evdev_release+0xd6/0xf0
> [ 22.604655] [<ffffffff8126ed57>] __fput+0x107/0x240
> [ 22.604659] [<ffffffff8126ebe6>] ____fput+0x16/0x20
> [ 22.604662] [<ffffffff810b8117>] task_work_run+0x87/0x130
> [ 22.604667] [<ffffffff810173ef>] do_notify_resume+0x9cf/0xa00
> [ 22.604670] [<ffffffff810edded>] ? trace_hardirqs_on+0xd/0x10
> [ 22.604675] [<ffffffff811d1ce3>] ? context_tracking_user_enter+0x13/0x20
> [ 22.604678] [<ffffffff81029c31>] ? syscall_trace_leave+0x111/0x340
> [ 22.604681] [<ffffffff8126eb76>] ? fput+0x76/0xd0
> [ 22.604684] [<ffffffff8126b435>] ? filp_close+0x65/0x90
> [ 22.604688] [<ffffffff81003017>] ? trace_hardirqs_on_thunk+0x17/0x19
> [ 22.604691] [<ffffffff8191932e>] int_signal+0x12/0x17
>
> Full dmesg-log and my kernel-config are attached.
>
> Hope this helps.
>
My Xorg crashed several times, so this bug is reproducible.

- Sedat -
View attachment "dmesg_4.2.0-6-llvmlinux-amd64_xorg-crashed.txt" of type "text/plain" (66997 bytes)