Date:	Sun, 05 Jun 2011 11:43:03 +0200
From:	Arne Jansen <lists@...-jansens.de>
To:	Ingo Molnar <mingo@...e.hu>
CC:	Peter Zijlstra <peterz@...radead.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	mingo@...hat.com, hpa@...or.com, linux-kernel@...r.kernel.org,
	efault@....de, npiggin@...nel.dk, akpm@...ux-foundation.org,
	frank.rowand@...sony.com, tglx@...utronix.de,
	linux-tip-commits@...r.kernel.org
Subject: Re: [tip:sched/locking] sched: Add p->pi_lock to task_rq_lock()

On 05.06.2011 10:17, Ingo Molnar wrote:
>
> * Peter Zijlstra <peterz@...radead.org> wrote:
>
>> On Fri, 2011-06-03 at 12:02 +0200, Arne Jansen wrote:
>>> On 03.06.2011 11:15, Peter Zijlstra wrote:
>>
>>>> Anyway, Arne, how long did you wait before power cycling the box? The
>>>> NMI watchdog should trigger in about a minute or so if it will trigger
>>>> at all (it's enabled in your config).
>>>
>>> No, it doesn't trigger,
>>
>> Bummer.
>
> Is there no output even when the console is configured to do an
> earlyprintk? That will allow the NMI watchdog to punch through even a
> printk or scheduler lockup.
>
> Arne, you can turn this on via one of these:
>
>    earlyprintk=vga,keep
>    earlyprintk=serial,ttyS0,115200,keep

My grub conf looks like this now:
kernel /boot/vmlinuz-2.6.39-rc3+ root=LABEL=label panic=15 console=ttyS0,9600 earlyprintk=serial,ttyS0,9600,keep quiet
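
For reference, a minimal sketch of how one might double check that a
kernel command line really carries the ',keep' option on earlyprintk=.
The helper names parse_cmdline() and earlyprintk_keeps() are made up for
illustration; on a running system the string would normally come from
/proc/cmdline.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        # Split on the first '=' only, so root=LABEL=label keeps its full value.
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def earlyprintk_keeps(cmdline):
    """Return True if earlyprintk= is present and lists the 'keep' option."""
    value = parse_cmdline(cmdline).get("earlyprintk")
    return value is not None and "keep" in value.split(",")


if __name__ == "__main__":
    # Parameters copied from the grub entry above (kernel image path omitted).
    line = ("root=LABEL=label panic=15 console=ttyS0,9600 "
            "earlyprintk=serial,ttyS0,9600,keep quiet")
    print(earlyprintk_keeps(line))
```

This only checks the spelling of the parameter, not whether the serial
line actually delivers the output.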

>
> (the ',keep' portion is important to have it active even after the
> regular console has been switched on.)
>
> Could you also please check with the (untested) patch below applied?
> This will turn off *all* printk done by the NMI watchdog and switches
> it to do pure early_printk() - which does not use any locking so it
> should never lock up.
>
> [ If you keep seeing 'NMI watchdog tick' messages periodically
>    occurring after the lockup then i'll send a more complete patch that
>    shuts off the regular printk path and makes sure that all output is
>    early_printk() based only. ]
>
> earlyprintk=,keep with such a patch has let me down only on the
> rarest of occasions.
>
> ( Arne, please also double check on a working bootup that the NMI
>    watchdog is actually ticking, by checking the NMI counts in
>    /proc/interrupts go up slowly but surely on all CPUs. )

It does, but _very_ slowly. Some CPUs do not count up for tens of
minutes if the machine is idle. If I generate some load like 'make
tags', the counters go up quite quickly.
After 4 minutes and one 'make cscope' it looks like this:
NMI:          8         13         43          5          2          3         22          1   Non-maskable interrupts
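
A rough sketch of how the per-CPU NMI counters could be compared between
two samples of /proc/interrupts, to see which CPUs are not ticking. The
function names nmi_counts() and stalled_cpus() are hypothetical, and the
parsing assumes the line layout shown above (an "NMI:" label, one counter
per CPU, then a description).

```python
def nmi_counts(interrupts_text):
    """Return the per-CPU NMI counters found in /proc/interrupts text."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "NMI:":
            counts = []
            for field in fields[1:]:
                if not field.isdigit():
                    break  # reached the trailing description text
                counts.append(int(field))
            return counts
    raise ValueError("no NMI line found")


def stalled_cpus(before, after):
    """Return the indices of CPUs whose NMI counter did not increase."""
    return [cpu for cpu, (a, b) in enumerate(zip(before, after)) if b <= a]


if __name__ == "__main__":
    # Sample line copied from the output above.
    sample = ("NMI:          8         13         43          5          2"
              "          3         22          1   Non-maskable interrupts")
    print(nmi_counts(sample))
```

In practice one would read /proc/interrupts twice, some minutes apart,
and pass both results to stalled_cpus().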

But I never see a single tick on console or in dmesg, even when I
replace the early_printk with a printk.

Btw, I get one warning on boot, but it looks irrelevant to me:
[   36.064321] ------------[ cut here ]------------
[   36.064328] WARNING: at kernel/printk.c:293 do_syslog+0xbf/0x550()
[   36.064330] Hardware name: X8SIL
[   36.064331] Attempt to access syslog with CAP_SYS_ADMIN but no CAP_SYSLOG (deprecated).
[   36.064333] Modules linked in: mpt2sas scsi_transport_sas raid_class
[   36.064338] Pid: 21625, comm: syslog-ng Not tainted 2.6.39-rc3+ #8
[   36.064340] Call Trace:
[   36.064344]  [<ffffffff81091f7a>] warn_slowpath_common+0x7a/0xb0
[   36.064347]  [<ffffffff81092051>] warn_slowpath_fmt+0x41/0x50
[   36.064351]  [<ffffffff8109d8a5>] ? ns_capable+0x25/0x60
[   36.064354]  [<ffffffff8109365f>] do_syslog+0xbf/0x550
[   36.064358]  [<ffffffff810c9575>] ? lock_release_holdtime+0x35/0x170
[   36.064362]  [<ffffffff811e17a7>] kmsg_open+0x17/0x20
[   36.064366]  [<ffffffff811d5f46>] proc_reg_open+0xa6/0x180
[   36.064368]  [<ffffffff811e1790>] ? kmsg_release+0x20/0x20
[   36.064371]  [<ffffffff811e1770>] ? read_vmcore+0x1d0/0x1d0
[   36.064374]  [<ffffffff811d5ea0>] ? proc_fill_super+0xb0/0xb0
[   36.064378]  [<ffffffff811790bb>] __dentry_open+0x15b/0x330
[   36.064382]  [<ffffffff8185d6e6>] ? _raw_spin_unlock+0x26/0x30
[   36.064385]  [<ffffffff81179379>] nameidata_to_filp+0x69/0x80
[   36.064388]  [<ffffffff81187a3a>] do_last+0x1da/0x840
[   36.064391]  [<ffffffff81188fdb>] path_openat+0xcb/0x3f0
[   36.064394]  [<ffffffff810ba5c5>] ? sched_clock_cpu+0xc5/0x100
[   36.064397]  [<ffffffff8118944a>] do_filp_open+0x7a/0xa0
[   36.064400]  [<ffffffff8185d6e6>] ? _raw_spin_unlock+0x26/0x30
[   36.064402]  [<ffffffff81196c12>] ? alloc_fd+0xf2/0x140
[   36.064405]  [<ffffffff8117a3d2>] do_sys_open+0x102/0x1e0
[   36.064408]  [<ffffffff8117a4db>] sys_open+0x1b/0x20
[   36.064412]  [<ffffffff81864dbb>] system_call_fastpath+0x16/0x1b
[   36.064414] ---[ end trace df959c735174f5f7 ]---
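
As an illustration of the condition named in the warning text, here is a
small sketch that decodes the effective capability mask of a process
(the CapEff: line in /proc/<pid>/status) and reports whether it holds
CAP_SYS_ADMIN without CAP_SYSLOG. The capability numbers are the ones
from linux/capability.h; the helper names are made up, and this only
mirrors the wording of the message, not the exact kernel code path.

```python
CAP_SYS_ADMIN = 21  # bit numbers as defined in linux/capability.h
CAP_SYSLOG = 34


def effective_caps(status_text):
    """Return the CapEff mask from the text of /proc/<pid>/status as an int."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return int(line.split()[1], 16)
    raise ValueError("no CapEff line found")


def has_cap(mask, cap):
    """Return True if capability bit 'cap' is set in 'mask'."""
    return bool((mask >> cap) & 1)


def admin_without_syslog(mask):
    """True when CAP_SYS_ADMIN is set but CAP_SYSLOG is not."""
    return has_cap(mask, CAP_SYS_ADMIN) and not has_cap(mask, CAP_SYSLOG)
```

For the process in the trace above, the input would be the contents of
/proc/21625/status while syslog-ng is running.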


-Arne

>
> Thanks,
>
> 	Ingo
>
