Message-ID: <CA+icZUWgABkGFWLwdMAqmYpLzNHnR4Qzg6k43Jno6X4GdL3Dpg@mail.gmail.com>
Date: Fri, 25 Sep 2015 14:13:43 +0200
From: Sedat Dilek <sedat.dilek@...il.com>
To: Jiri Kosina <jikos@...nel.org>
Cc: Tejun Heo <tj@...nel.org>, LKML <linux-kernel@...r.kernel.org>,
Darren Hart <dvhart@...ux.intel.com>,
Lai Jiangshan <jiangshanlai@...il.com>
Subject: Re: [Linux v4.2] workqueue: llvmlinux: acpid: BUG: sleeping function
called from invalid context at kernel/workqueue.c:2680
On Fri, Sep 25, 2015 at 2:00 PM, Jiri Kosina <jikos@...nel.org> wrote:
> On Thu, 24 Sep 2015, Sedat Dilek wrote:
>
>> >> >> > >> [ 24.705767] [<ffffffff8149287d>] dump_stack+0x7d/0xa0
>> >> >> > >> [ 24.705774] [<ffffffff810cbf7a>] ___might_sleep+0x28a/0x2a0
>> >> >> > >> [ 24.705779] [<ffffffff810cbc7f>] __might_sleep+0x4f/0xc0
>> >> >> > >> [ 24.705784] [<ffffffff810ae8ff>] start_flush_work+0x2f/0x290
>> >> >> > >> [ 24.705789] [<ffffffff810ae8ac>] flush_work+0x5c/0x80
>> >> >> > >> [ 24.705792] [<ffffffff810ae86a>] ? flush_work+0x1a/0x80
>> >> >> > >> [ 24.705799] [<ffffffff810eddcd>] ? trace_hardirqs_off+0xd/0x10
>> >> >> > >> [ 24.705804] [<ffffffff810ad938>] ? try_to_grab_pending+0x48/0x360
>> >> >> > >> [ 24.705810] [<ffffffff81917e13>] ? _raw_spin_lock_irqsave+0x73/0x80
>> >> >> > >> [ 24.705814] [<ffffffff810aecf9>] __cancel_work_timer+0x179/0x260
>> >> >>
>> >> >> This one is even stranger. It says that flush_work() is being called
>> >> >> from __cancel_work_timer() with IRQs disabled, but flags are explicitly
>> >> >> restored just one statement before that, and usbhid_close() explicitly
>> >> >> calls cancel_work_sync() after unconditionally enabling interrupts.
>> >> >>
>> >> >> So I am not able to make any sense of either of the traces really.
>> >> >>
>> >> >> Are you seeing this with the same .config on a GCC-compiled kernel as
>> >> >> well?
>> >> >
>> >> > Actually could you please provide disassembly of your
>> >> > __cancel_work_timer()?
>> >> >
>> >>
>> >> Disassembly of which file - the corresponding workqueue or hid file?
>> >
>> > make kernel/workqueue.o
>> > objdump -Dr kernel/workqueue.o
>> >
>> > and copy/paste output for __cancel_work_timer function.
>> >
>>
>> It was a bit long, so I attached the full output.
>
> 216a: ff 14 25 00 00 00 00 callq *0x0
> 216d: R_X86_64_32S pv_irq_ops+0x8
> 2171: 4c 89 ef mov %r13,%rdi
> 2174: e8 00 00 00 00 callq 2179 <__cancel_work_timer+0x189>
> 2175: R_X86_64_PC32 flush_work-0x4
>
> So the ordering generated by compiler is correct here.
>
> This however:
>
> [ 24.824639] hardirqs last enabled at (7913): [<ffffffff8192a2b2>] _raw_spin_unlock_irq+0x32/0x60
> [ 24.824646] hardirqs last disabled at (7914): [<ffffffff81120e37>] del_timer_sync+0x37/0x110
>
> combined with the stacktrace above, still doesn't make much sense; the IRQ
> disable/enable is properly paired in del_timer_sync() and in the timer base
> lock handling.
>
> Could you look into the generated assembly whether the sequence
>
> local_irq_save(flags);
> lock_map_acquire(&timer->lockdep_map);
> lock_map_release(&timer->lockdep_map);
> local_irq_restore(flags);
>
> in del_timer_sync() hasn't been optimized in some damaging way?
>
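[Editor's note: Jiri's earlier `make` / `objdump -Dr` steps dump the whole object file. Since objdump separates functions with a blank line, a small awk filter can pull out just the one function of interest. This is only a sketch; the heredoc sample below stands in for the real workqueue.o listing, and the filename is illustrative.]

```shell
# Create a tiny stand-in for the full objdump listing (two functions,
# separated by a blank line, as objdump prints them):
cat > sample.dump <<'EOF'
0000000000002100 <__cancel_work_timer>:
    2171:	4c 89 ef	mov    %r13,%rdi
    2174:	e8 00 00 00 00	callq  2179 <__cancel_work_timer+0x189>

0000000000002200 <cancel_work_sync>:
    2200:	c3	retq
EOF
# Print from the function's header line up to the next blank line:
awk '/<__cancel_work_timer>:/ {p=1} p {print} /^$/ {p=0}' sample.dump
```

The same one-liner applied to the real dump would have kept the attachment short.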
I am looking into my CLANG objdump...
$ egrep -nr 'save|restore|acquire|release' \
    objdump-Dr_kernel-workqueue_o_CLANG-3-7.txt | egrep 'irq|map'
5718: 4601: R_X86_64_PC32 _raw_spin_unlock_irqrestore-0x4
5766: 4699: R_X86_64_PC32 _raw_spin_lock_irqsave-0x4
6173: 4bd9: R_X86_64_PC32 _raw_spin_lock_irqsave-0x4
6265: 4d05: R_X86_64_PC32 _raw_spin_unlock_irqrestore-0x4
...this is what I have.
Does that help?
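[Editor's note: the spinlock relocations above cover only the lock paths; on this paravirt config, local_irq_save()/local_irq_restore() compile to indirect calls through pv_irq_ops, as visible in the `__cancel_work_timer` dump quoted earlier (`R_X86_64_32S pv_irq_ops+0x8`). A broader pattern would also catch those, plus any raw pushf/popf sequences. A sketch, run here against a two-line stand-in for the real dump file:]

```shell
# Two representative lines from the CLANG objdump, used as sample input:
cat > dump-sample.txt <<'EOF'
216d: R_X86_64_32S pv_irq_ops+0x8
5766: 4699: R_X86_64_PC32 _raw_spin_lock_irqsave-0x4
EOF
# Look for paravirt IRQ-op calls and raw flag save/restore instructions:
egrep -n 'pv_irq_ops|pushf|popf' dump-sample.txt
```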
- Sedat -