Message-ID: <4DEBBFF9.2030101@die-jansens.de>
Date: Sun, 05 Jun 2011 19:42:17 +0200
From: Arne Jansen <lists@...-jansens.de>
To: Ingo Molnar <mingo@...e.hu>
CC: Peter Zijlstra <peterz@...radead.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
mingo@...hat.com, hpa@...or.com, linux-kernel@...r.kernel.org,
efault@....de, npiggin@...nel.dk, akpm@...ux-foundation.org,
frank.rowand@...sony.com, tglx@...utronix.de,
linux-tip-commits@...r.kernel.org
Subject: Re: [debug patch] printk: Add a printk killswitch to robustify NMI
watchdog messages
On 05.06.2011 19:20, Ingo Molnar wrote:
>
> * Arne Jansen <lists@...-jansens.de> wrote:
>
>>>>> With the all-locks-printed output we can double check what locks are
>>>>> held.
>>
>> btw, the output posted earlier also contains some BUG: spinlock
>> lockup.
>
> hm, it's hard to interpret that without the spin_lock()/unlock()
> logic keeping the dumps apart.
The locking was in place from the beginning. As the output is still
scrambled, there must be other sources of BUG/WARN outside the watchdog
that trigger in parallel. Maybe we should protect the whole BUG/WARN
mechanism with a lock and send it to early_printk from the beginning,
so we don't have to wait for the watchdog to kill off printk before the
first BUG can get through.
Or just let WARN/BUG kill off printk instead of the watchdog (though
I'd first have to get rid of that syslog WARN on startup).
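Roughly the direction I mean, as a small userspace model only (printk_killed,
early_vprintk, my_printk and friends are made-up stand-ins for illustration,
not the real kernel symbols): once the kill flag is set, every later printk
bypasses the buffer lock and writes straight out, the way early_printk would,
so a BUG/WARN dump can't deadlock on the log buffer lock.

#include <pthread.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the kernel's logbuf_lock and log buffer. */
static pthread_mutex_t logbuf_lock = PTHREAD_MUTEX_INITIALIZER;
static char logbuf[4096];

/* Set once WARN/BUG (or the watchdog) decides printk is no longer safe. */
static int printk_killed;

/* Direct, lock-free output path, standing in for early_printk(). */
static void early_vprintk(const char *fmt, va_list ap)
{
        vfprintf(stderr, fmt, ap);
}

static void my_vprintk(const char *fmt, va_list ap)
{
        if (printk_killed) {
                /* Bypass the buffer lock entirely. */
                early_vprintk(fmt, ap);
                return;
        }
        pthread_mutex_lock(&logbuf_lock);
        vsnprintf(logbuf + strlen(logbuf),
                  sizeof(logbuf) - strlen(logbuf), fmt, ap);
        pthread_mutex_unlock(&logbuf_lock);
}

static void my_printk(const char *fmt, ...)
{
        va_list ap;

        va_start(ap, fmt);
        my_vprintk(fmt, ap);
        va_end(ap);
}

/* What WARN/BUG would do first: switch all later output to the safe path. */
static void printk_kill(void)
{
        printk_killed = 1;
}

int main(void)
{
        my_printk("normal message, goes through the locked log buffer\n");
        printk_kill();          /* a WARN/BUG would do this before dumping state */
        my_printk("post-kill message, bypasses logbuf_lock entirely\n");
        fprintf(stdout, "buffered so far: %s", logbuf);
        return 0;
}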
>
> Was lockdep enabled as you started the test?
It was at least enabled in the config, but I haven't double-checked.
At the moment it is.
>
> but ... if the lock is reasonably sorted then it's this one:
>
> <0>BUG: spinlock lockup on CPU#3, modprobe/22211, ffffffff81e1c0c0
> Pid: 22211, comm: modprobe Tainted: G W 2.6.39-rc3+ #19
> Call Trace:
> [<ffffffff813af306>] do_raw_spin_lock+0x156/0x170
> [<ffffffff8185ce71>] _raw_spin_lock+0x51/0x70
> [<ffffffff81092df6>] ? vprintk+0x76/0x4a0
> [<ffffffff81092df6>] vprintk+0x76/0x4a0
> [<ffffffff810c5f8d>] ? trace_hardirqs_off+0xd/0x10
> [<ffffffff81859e19>] printk+0x63/0x65
> [<ffffffff813af301>] do_raw_spin_lock+0x151/0x170
> [<ffffffff8108a4bd>] ? try_to_wake_up+0x29d/0x350
> [<ffffffff8185ce71>] _raw_spin_lock+0x51/0x70
> [<ffffffff81092df6>] ? vprintk+0x76/0x4a0
> [<ffffffff81092df6>] vprintk+0x76/0x4a0
> [<ffffffff8108758b>] ? cpuacct_charge+0x9b/0xb0
> [<ffffffff8108750f>] ? cpuacct_charge+0x1f/0xb0
> [<ffffffff8108a4bd>] ? try_to_wake_up+0x29d/0x350
> [<ffffffff81859e19>] printk+0x63/0x65
> [<ffffffff813af090>] spin_bug+0x70/0xf0
> [<ffffffff813af2d9>] do_raw_spin_lock+0x129/0x170
> [<ffffffff8108a4bd>] ? try_to_wake_up+0x29d/0x350
> [<ffffffff8185ce71>] _raw_spin_lock+0x51/0x70
> [<ffffffff81092df6>] ? vprintk+0x76/0x4a0
>
> and it occurred before the lockup in the scheduler.
>
> Which could be due to a race between disabling lockdep on one CPU and
> the scheduler doing the lock-held check on another CPU.
>
> Do you get any messages after the assert is removed, during the test?
No.