lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LNX.2.00.1509251347500.30132@pobox.suse.cz>
Date:	Fri, 25 Sep 2015 14:00:14 +0200 (CEST)
From:	Jiri Kosina <jikos@...nel.org>
To:	Sedat Dilek <sedat.dilek@...il.com>
cc:	Tejun Heo <tj@...nel.org>, LKML <linux-kernel@...r.kernel.org>,
	Darren Hart <dvhart@...ux.intel.com>,
	Lai Jiangshan <jiangshanlai@...il.com>
Subject: Re: [Linux v4.2] workqueue: llvmlinux: acpid: BUG: sleeping function
 called from invalid context at kernel/workqueue.c:2680

On Thu, 24 Sep 2015, Sedat Dilek wrote:

> >> >> > >> [   24.705767]  [<ffffffff8149287d>] dump_stack+0x7d/0xa0
> >> >> > >> [   24.705774]  [<ffffffff810cbf7a>] ___might_sleep+0x28a/0x2a0
> >> >> > >> [   24.705779]  [<ffffffff810cbc7f>] __might_sleep+0x4f/0xc0
> >> >> > >> [   24.705784]  [<ffffffff810ae8ff>] start_flush_work+0x2f/0x290
> >> >> > >> [   24.705789]  [<ffffffff810ae8ac>] flush_work+0x5c/0x80
> >> >> > >> [   24.705792]  [<ffffffff810ae86a>] ? flush_work+0x1a/0x80
> >> >> > >> [   24.705799]  [<ffffffff810eddcd>] ? trace_hardirqs_off+0xd/0x10
> >> >> > >> [   24.705804]  [<ffffffff810ad938>] ? try_to_grab_pending+0x48/0x360
> >> >> > >> [   24.705810]  [<ffffffff81917e13>] ? _raw_spin_lock_irqsave+0x73/0x80
> >> >> > >> [   24.705814]  [<ffffffff810aecf9>] __cancel_work_timer+0x179/0x260
> >> >>
> >> >> This one is even more strange. It says that flush_work() is being called
> >> >> from __cancel_work_timer() with IRQs disabled, but flags are explicitly
> >> >> restored just one statement before that, and usbhid_close() explicitly
> >> >> calls cancel_work_sync() after unconditionally enabling interrupts.
> >> >>
> >> >> So I am not able to make any sense of either of the traces really.
> >> >>
> >> >> Are you seeing this with the same .config with GCC-compiled kernel as
> >> >> well?
> >> >
> >> > Actually could you please provide disassembly of your
> >> > __cancel_work_timer()?
> >> >
> >>
> >> Disassembly of which file - corresponding workqueue or hid file?
> >
> >         make kernel/workqueue.o
> >         objdump -Dr kernel/workqueue.o
> >
> > and copy/paste output for __cancel_work_timer function.
> >
> 
> It was a bit long, so I attached the full output.

    216a:       ff 14 25 00 00 00 00    callq  *0x0
                        216d: R_X86_64_32S      pv_irq_ops+0x8
    2171:       4c 89 ef                mov    %r13,%rdi
    2174:       e8 00 00 00 00          callq  2179 <__cancel_work_timer+0x189>
                        2175: R_X86_64_PC32     flush_work-0x4

So the ordering generated by compiler is correct here.

This however:

[   24.824639] hardirqs last  enabled at (7913): [<ffffffff8192a2b2>] _raw_spin_unlock_irq+0x32/0x60
[   24.824646] hardirqs last disabled at (7914): [<ffffffff81120e37>] del_timer_sync+0x37/0x110

combined with the stacktrace above, doesn't still make too much sense; IRQ 
disable/enable is properly paired in del_timer_sync() and timer base lock 
locking.

Could you look into the generated assembly whether the sequence

        local_irq_save(flags);
        lock_map_acquire(&timer->lockdep_map);
        lock_map_release(&timer->lockdep_map);
        local_irq_restore(flags);

in del_timer_sync() hasn't been optimized in some damaging way?

-- 
Jiri Kosina
SUSE Labs

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ