Message-Id: <20080423.162311.118426680.davem@davemloft.net>
Date:	Wed, 23 Apr 2008 16:23:11 -0700 (PDT)
From:	David Miller <davem@...emloft.net>
To:	mingo@...e.hu
Cc:	linux-kernel@...r.kernel.org, tglx@...utronix.de,
	a.p.zijlstra@...llo.nl
Subject: Re: [patch] softlockup: fix false positives on nohz if CPU is 100%
 idle for more than 60 seconds

From: Ingo Molnar <mingo@...e.hu>
Date: Wed, 23 Apr 2008 15:36:56 +0200

> as a temporary workaround please try the patch below, until we can 
> reproduce and fix the bug.

Yeah, if you basically turn off the code paths, that particular set of
problems goes away :-/

So then we're at the next bug: CPUs getting wedged in the group
aggregate code.

I'll try Peter's patches which were posted today.

[  760.218048] BUG: soft lockup - CPU#5 stuck for 61s! [swapper:0]
[  760.218292] TSTATE: 0000000080001603 TPC: 000000000054e0c0 TNPC: 000000000054e0c4 Y: 00000000    Not tainted
[  760.218325] TPC: <find_next_bit+0xe4/0x11c>
[  760.218336] g0: 0000000000009000 g1: 0000000000000000 g2: ffffffffffffffff g3: 0000000000000030
[  760.218352] g4: fffff803ff0d5880 g5: fffff80007c8a000 g6: fffff803ff0ec000 g7: 00000000007bb6d0
[  760.218368] o0: 000000000000fff0 o1: 0000000000000040 o2: 0000000000000034 o3: 0000000000000000
[  760.218383] o4: 0000000100009332 o5: 0000000000000000 sp: fffff803ff0eee21 ret_pc: 000000000054de08
[  760.218402] RPC: <__next_cpu+0x18/0x2c>
[  760.218413] l0: 00000000007f0000 l1: 0000009980001602 l2: 0000000000455d2c l3: 0000000000000400
[  760.218428] l4: 0000000000000000 l5: 0000000000000002 l6: 0000000000000000 l7: 0000000000000008
[  760.218443] i0: 0000000000000033 i1: 00000000007bb6c8 i2: 0000000000000038 i3: fffff803f73bf100
[  760.218459] i4: 0000000000845000 i5: 0000000000000401 i6: fffff803ff0eeee1 i7: 0000000000455d48
[  760.218487] I7: <aggregate_group_shares+0x10c/0x16c>
[  823.716459] INFO: task collect2:4106 blocked for more than 120 seconds.
[  823.716680] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  823.716815] collect2      D 00000000006b4a80     0  4106   4105
[  823.716831] Call Trace:
[  823.716839]  [00000000006b4c40] schedule_timeout+0x20/0xa4
[  823.716859]  [00000000006b4a80] wait_for_common+0xf4/0x184
[  823.716875]  [000000000045f2cc] do_fork+0x1dc/0x234
[  823.716894]  [0000000000406214] linux_sparc_syscall32+0x3c/0x40
[  823.716917]  [0000000000023f50] 0x23f58
