linux-kernel - Re: linux-next: WARNING: at kernel/lockdep.c:2680 check

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <1212483192.6304.8.camel@twins>
Date:	Tue, 03 Jun 2008 10:53:12 +0200
From:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
To:	Kevin Winchester <kjwinchester@...il.com>
Cc:	Ingo Molnar <mingo@...hat.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: linux-next: WARNING: at kernel/lockdep.c:2680
	check_flags+0x98/0x151()

On Mon, 2008-06-02 at 20:47 -0300, Kevin Winchester wrote:
> In next-20080530 and next-20080602 (and possibly earlier - I can't 
> remember the linux-next tree before that I tried) I get the following:
> 
> [   12.885153] ------------[ cut here ]------------
> [   12.885203] WARNING: at kernel/lockdep.c:2680 check_flags+0x98/0x151()
> [   12.885248] Pid: 4, comm: watchdog/0 Not tainted 
> 2.6.26-rc4-next-20080602 #13
> [   12.885292]
> [   12.885293] Call Trace:
> [   12.885364]  [<ffffffff8022bbd5>] warn_on_slowpath+0x58/0x8a
> [   12.885410]  [<ffffffff804c9cfe>] ? _spin_unlock_irqrestore+0x51/0x6d
> [   12.885455]  [<ffffffff8032ff41>] ? debug_locks_off+0x9/0x3c
> [   12.885498]  [<ffffffff802582dd>] ? ftrace_record_ip+0x1fa/0x272
> [   12.885542]  [<ffffffff8025278a>] ? watchdog+0xc5/0x1ff
> [   12.885584]  [<ffffffff8020b2c0>] ? mcount_call+0x5/0x35
> [   12.885627]  [<ffffffff8025278a>] ? watchdog+0xc5/0x1ff
> [   12.885668]  [<ffffffff80247c80>] check_flags+0x98/0x151
> [   12.885710]  [<ffffffff8024ae72>] lock_acquire+0x4a/0xa9
> [   12.885753]  [<ffffffff8025278a>] ? watchdog+0xc5/0x1ff
> [   12.885795]  [<ffffffff802526c5>] ? watchdog+0x0/0x1ff
> [   12.885837]  [<ffffffff804c98da>] _read_lock+0x37/0x43
> [   12.885879]  [<ffffffff8025278a>] watchdog+0xc5/0x1ff
> [   12.885921]  [<ffffffff802526c5>] ? watchdog+0x0/0x1ff
> [   12.885963]  [<ffffffff8023e48b>] kthread+0x4e/0x7b
> [   12.886005]  [<ffffffff8020bf18>] child_rip+0xa/0x12
> [   12.886046]  [<ffffffff80227d8f>] ? finish_task_switch+0x57/0x92
> [   12.886090]  [<ffffffff804c9d55>] ? _spin_unlock_irq+0x3b/0x57
> [   12.886133]  [<ffffffff8020bad3>] ? restore_args+0x0/0x30
> [   12.886137]  [<ffffffff8023e43d>] ? kthread+0x0/0x7b
> [   12.886137]  [<ffffffff8020bf0e>] ? child_rip+0x0/0x12
> [   12.886137]
> [   12.886137] ---[ end trace 60e7f098a6913839 ]---
> [   12.886137] possible reason: unannotated irqs-on.
> [   12.886137] irq event stamp: 20
> [   12.886137] hardirqs last  enabled at (19): [<ffffffff80249cc3>] 
> trace_hardirqs_on+0xd/0xf
> [   12.886137] hardirqs last disabled at (20): [<ffffffff80248565>] 
> trace_hardirqs_off+0xd/0xf
> [   12.886137] softirqs last  enabled at (0): [<ffffffff80229fef>] 
> copy_process+0x2da/0x109e
> [   12.886137] softirqs last disabled at (0): [<0000000000000000>] 0x0
> 
> Do I understand this correctly that something enabled irqs in a way that 
> got around lockdep?  I assume the problem is not in watchdog, just that 
> the watchdog was the next thing to run that interacted with irqs and 
> thus lockdep picked up the situation then?
> 
> Is there additional debugging I can do, given some instructions?  If the 
> cause is readily apparent to anyone, could they let me know (for my own 
> interest) why it is apparent so that I can investigate things like this 
> further next time?


You are correct - someone did sti but failed to call
trace_hardirqs_on(). Frequently its possible to isolate the code from
knowing the last recorded event:

[   12.886137] irq event stamp: 20
[   12.886137] hardirqs last disabled at (20): [<ffffffff80248565>] trace_hardirqs_off+0xd/0xf

However your compilation seems to have lost the caller IP:

void trace_hardirqs_off(void)
{
        trace_hardirqs_off_caller(CALLER_ADDR0);
}
EXPORT_SYMBOL(trace_hardirqs_off);

So that is of little help here. (Not sure how that happened, nor how you
could fix that - perhaps turn on some debugging knobs like build with
debug info etc..)


> This is completely reproducible on every boot - should I try to bisect it?

That is a possibility yes - if you can find the offending patch it
should be relatively straight forward to find the offending sti.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/