Message-ID: <4DEBE3DF.70104@die-jansens.de>
Date:	Sun, 05 Jun 2011 22:15:27 +0200
From:	Arne Jansen <lists@...-jansens.de>
To:	Ingo Molnar <mingo@...e.hu>
CC:	Peter Zijlstra <peterz@...radead.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	mingo@...hat.com, hpa@...or.com, linux-kernel@...r.kernel.org,
	efault@....de, npiggin@...nel.dk, akpm@...ux-foundation.org,
	frank.rowand@...sony.com, tglx@...utronix.de,
	linux-tip-commits@...r.kernel.org
Subject: Re: [debug patch] printk: Add a printk killswitch to robustify NMI
 watchdog messages

On 05.06.2011 21:44, Ingo Molnar wrote:
>
> * Arne Jansen<lists@...-jansens.de>  wrote:
>
>>  From the timing I see I'd guess it has something to do with the
>> scheduler kicking in during printk. I'm neither familiar with the
>> printk code nor with the scheduler.
>
> Yeah, that's the well-known wake-up of klogd:
>
> void console_unlock(void)
> {
> ...
>          up(&console_sem);
>
> actually ... that's not the klogd wake-up at all (!). I so suck today
> at bug analysis :-)
>
> It's the console lock()/unlock() sequence, and guess what does it:
>
>   drivers/tty/tty_io.c:   console_lock();
>   drivers/tty/vt/selection.c:     console_lock();
>
> and the vt.c code in a dozen places.
>
> So maybe it's some sort of tty related memory corruption that was
> made *visible* via the extra assert that the scheduler is doing? The
> pi_list is embedded in task struct.
>
> This would explain why only printk() triggers it and other wakeup
> patterns not.
>
> Now, i don't really like this theory either. Why is there no other
> type of corruption? And exactly why did only the task_struct::pi_lock
> field get corrupted while nearby fields not? Also, none of the fields
> near pi_lock are even remotely tty related.

Could lockdep simply be getting confused by the lockdep_off()/lockdep_on()
calls in printk while scheduling is allowed? There aren't many users of
lockdep_off().
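
To make the question concrete, here is a toy user-space model of one way a
lock tracker could in principle get confused by an off/on window. It is only
a sketch, not kernel code: it assumes tracking is disabled through a per-task
counter (roughly what lockdep_off()/lockdep_on() do with
current->lockdep_recursion), and every toy_* name is hypothetical. It does not
claim to reproduce the problem discussed in this thread.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_HELD 8

/* Hypothetical stand-in for the lock-tracking state of one task. */
struct toy_task {
    int lockdep_recursion;           /* non-zero: tracking is disabled */
    int held_depth;                  /* number of tracked held locks */
    const void *held[TOY_MAX_HELD];  /* tracked held locks */
};

/* Disable tracking for this task by bumping its counter. */
static void toy_lockdep_off(struct toy_task *t)
{
    t->lockdep_recursion++;
}

/* Re-enable tracking; calls must be balanced with toy_lockdep_off(). */
static void toy_lockdep_on(struct toy_task *t)
{
    assert(t->lockdep_recursion > 0);
    t->lockdep_recursion--;
}

/* Record an acquisition, unless tracking is off for this task. */
static void toy_acquire(struct toy_task *t, const void *lock)
{
    if (t->lockdep_recursion)
        return;                      /* invisible to the tracker */
    assert(t->held_depth < TOY_MAX_HELD);
    t->held[t->held_depth++] = lock;
}

/*
 * Check a release against the tracked locks. Returns false when the
 * tracker has no record of the lock, i.e. when it would complain.
 */
static bool toy_release(struct toy_task *t, const void *lock)
{
    if (t->lockdep_recursion)
        return true;                 /* tracking off: nothing to check */
    for (int i = t->held_depth - 1; i >= 0; i--) {
        if (t->held[i] == lock) {
            t->held[i] = t->held[--t->held_depth];
            return true;
        }
    }
    return false;
}

/* Acquire and release both happen while tracking is on. */
static bool toy_scenario_balanced(void)
{
    struct toy_task t = {0};
    int lock;

    toy_acquire(&t, &lock);
    return toy_release(&t, &lock);
}

/* Acquire inside the off/on window, release after it. */
static bool toy_scenario_unbalanced(void)
{
    struct toy_task t = {0};
    int lock;

    toy_lockdep_off(&t);
    toy_acquire(&t, &lock);          /* not recorded */
    toy_lockdep_on(&t);
    return toy_release(&t, &lock);   /* tracker never saw the acquire */
}
```

In this model toy_scenario_balanced() reports a matched release, while
toy_scenario_unbalanced() reports a release the tracker knows nothing about,
because the acquisition happened while tracking was switched off.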

I can try again tomorrow to get a dump of all logs from the
watchdog, but that's enough for today...
