Message-Id: <1498728106.19484.21.camel@abdul>
Date:   Thu, 29 Jun 2017 14:51:46 +0530
From:   Abdul Haleem <abdhalee@...ux.vnet.ibm.com>
To:     linuxppc-dev <linuxppc-dev@...ts.ozlabs.org>
Cc:     linux-next <linux-next@...r.kernel.org>,
        Stephen Rothwell <sfr@...b.auug.org.au>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        npiggin <npiggin@...il.com>,
        sachinp <sachinp@...ux.vnet.ibm.com>, mpe <mpe@...erman.id>,
        paulus@...ba.org
Subject: [linux-next] cpus stalls detected few hours after booting next
 kernel

Hi,

This was observed on a PowerPC bare-metal machine running the
4.12.0-rc7-next-20170628 kernel.

About an hour after the kernel is booted, dmesg is flooded with CPU stall
messages and NMI backtraces.

Tests: No tests; the machine was simply left idle for a few hours after boot.
Machine Type: Power 8 Bare-Metal
Kernel : 4.12.0-rc7-next-20170628
gcc: 4.8.5 20150623


Trace logs:
-----------

[ 4255.148372] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 4255.148446] 	0-...: (30267 GPs behind) idle=238/0/0 softirq=504/504 fqs=0 
[ 4255.148462] 	1-...: (665 GPs behind) idle=6f0/0/0 softirq=347/347 fqs=0 
[ 4255.148477] 	2-...: (409 GPs behind) idle=8ac/0/0 softirq=496/496 fqs=0 
[ 4255.148542] 	6-...: (27907 GPs behind) idle=808/0/0 softirq=2609/2610 fqs=0 
[ 4255.148618] 	8-...: (2672 GPs behind) idle=c78/0/0 softirq=338/338 fqs=0 
[ 4255.148682] 	9-...: (2229 GPs behind) idle=f18/0/0 softirq=432/432 fqs=0 
[ 4255.148745] 	10-...: (24370 GPs behind) idle=a34/0/0 softirq=797/798 fqs=0 
[ 4255.148809] 	11-...: (24870 GPs behind) idle=848/0/0 softirq=427/427 fqs=0 
[ 4255.148873] 	12-...: (24870 GPs behind) idle=11c/0/0 softirq=297/297 fqs=0 	
[ 4255.148936] 	13-...: (30549 GPs behind) idle=d44/0/0 softirq=278/278 fqs=0 
[ 4255.148999] 	14-...: (30555 GPs behind) idle=420/0/0 softirq=359/359 fqs=0 
[ 4255.149063] 	15-...: (30255 GPs behind) idle=1dc/0/0 softirq=416/416 fqs=0 
[ 4255.149127] 	16-...: (30531 GPs behind) idle=a58/0/0 softirq=275/275 fqs=0 
[ 4255.149191] 	18-...: (30531 GPs behind) idle=b14/0/0 softirq=310/310 fqs=0 
[ 4255.149254] 	19-...: (30558 GPs behind) idle=174/0/0 softirq=290/291 fqs=0 
[ 4255.149318] 	20-...: (30012 GPs behind) idle=0f0/0/0 softirq=495/495 fqs=0 
[ 4255.149382] 	21-...: (3782 GPs behind) idle=438/0/0 softirq=445/445 fqs=0 
[ 4255.149445] 	22-...: (29534 GPs behind) idle=b1c/0/0 softirq=361/361 fqs=0 
[ 4255.149508] 	23-...: (1019 GPs behind) idle=3d0/0/0 softirq=308/308 fqs=0 
[ 4255.149572] 	24-...: (30558 GPs behind) idle=ef0/0/0 softirq=643/643 fqs=0 
[ 4255.149636] 	25-...: (30556 GPs behind) idle=de8/0/0 softirq=495/495 fqs=0 
[ 4255.149699] 	26-...: (30531 GPs behind) idle=740/0/0 softirq=310/310 fqs=0 
[ 4255.149763] 	27-...: (30531 GPs behind) idle=5cc/0/0 softirq=304/304 fqs=0 
[ 4255.149826] 	28-...: (9032 GPs behind) idle=b9c/0/0 softirq=308/308 fqs=0 
[ 4255.149890] 	29-...: (30552 GPs behind) idle=fe8/0/0 softirq=304/304 fqs=0 
[ 4255.149953] 	30-...: (30531 GPs behind) idle=a3c/0/0 softirq=473/473 fqs=0 
[ 4255.150017] 	31-...: (30555 GPs behind) idle=cfc/0/0 softirq=308/308 fqs=0 
[ 4255.150081] 	32-...: (29496 GPs behind) idle=be4/0/0 softirq=268/268 fqs=0 
[ 4255.150144] 	34-...: (4884 GPs behind) idle=af8/0/0 softirq=333/333 fqs=0 
[ 4255.150208] 	35-...: (4885 GPs behind) idle=c84/0/0 softirq=6151/6151 fqs=0 
[ 4255.150283] 	36-...: (14337 GPs behind) idle=23c/0/0 softirq=946/946 fqs=0 
[ 4255.150347] 	37-...: (30220 GPs behind) idle=800/0/0 softirq=288/288 fqs=0 
[ 4255.150410] 	38-...: (30217 GPs behind) idle=068/0/0 softirq=332/332 fqs=0 
[ 4255.150474] 	39-...: (30078 GPs behind) idle=f04/0/0 softirq=270/270 fqs=0 
[ 4255.150538] 	40-...: (18 GPs behind) idle=d00/0/0 softirq=1235/1235 fqs=0 
[ 4255.150602] 	64-...: (209 GPs behind) idle=d74/0/0 softirq=2358/2358 fqs=0 
[ 4255.150665] 	65-...: (204 GPs behind) idle=9b0/0/0 softirq=2531/2531 fqs=0 
[ 4255.150729] 	66-...: (273 GPs behind) idle=bbc/0/0 softirq=2684/2684 fqs=0 
[ 4255.150793] 	69-...: (364 GPs behind) idle=b64/0/0 softirq=2801/2801 fqs=0 
[ 4255.150856] 	72-...: (30 GPs behind) idle=e28/0/0 softirq=1381/1381 fqs=0 
[ 4255.150920] 	77-...: (7161 GPs behind) idle=acc/0/0 softirq=7358/7358 fqs=0 
[ 4255.150996] 	78-...: (1 GPs behind) idle=b68/0/0 softirq=11962/11967 fqs=0 
[ 4255.151060] 	79-...: (119 GPs behind) idle=bf0/0/0 softirq=6186/6186 fqs=0 
[ 4255.151121] 	(detected by 5, t=2102 jiffies, g=30521, c=30520, q=1069)
[ 4255.151192] Sending NMI from CPU 5 to CPUs 0:
[ 4255.151246] NMI backtrace for cpu 0
[ 4255.151287] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.12.0-rc7-next-20170628 #2
[ 4255.151363] task: c0000007f8495600 task.stack: c0000007f842c000
[ 4255.151428] NIP: c00000000000adb4 LR: c000000000015584 CTR: c00000000082f4b0
[ 4255.151504] REGS: c0000007f842fb60 TRAP: 0e81   Not tainted  (4.12.0-rc7-next-20170628)
[ 4255.151578] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>
[ 4255.151586]   CR: 22004884  XER: 00000000
[ 4255.151675] CFAR: c00000000062c108 SOFTE: 1 
[ 4255.151675] GPR00: c00000000082d6c8 c0000007f842fde0 c000000001062b00 0000000028000000 
[ 4255.151675] GPR04: 0000000000000003 c000000000089830 00003aa8056bc35f 0000000000000001 
[ 4255.151675] GPR08: 0000000000000002 c000000000d52d80 00000007fe7d0000 9000000000001003 
[ 4255.151675] GPR12: c00000000082a0c0 c00000000fd40000 
[ 4255.152217] NIP [c00000000000adb4] .L__replay_interrupt_return+0x0/0x4
[ 4255.152334] LR [c000000000015584] arch_local_irq_restore+0x74/0x90
[ 4255.152447] Call Trace:
[ 4255.152499] [c0000007f842fde0] [c00000000017cec0] tick_broadcast_oneshot_control+0x40/0x60 (unreliable)
[ 4255.152662] [c0000007f842fe00] [c00000000082d6c8] cpuidle_enter_state+0x108/0x3d0
[ 4255.152803] [c0000007f842fe60] [c000000000133e94] call_cpuidle+0x44/0x80
[ 4255.152921] [c0000007f842fe80] [c000000000134240] do_idle+0x290/0x2f0
[ 4255.153037] [c0000007f842fef0] [c000000000134474] cpu_startup_entry+0x34/0x40
[ 4255.153176] [c0000007f842ff20] [c000000000041944] start_secondary+0x304/0x360
[ 4255.153316] [c0000007f842ff90] [c00000000000b16c] start_secondary_prolog+0x10/0x14
[ 4255.153455] Instruction dump:
[ 4255.153527] 7d200026 618c8000 2c030900 4182e320 2c030500 4182dd68 2c030e80 4182ffa4 
[ 4255.153668] 2c030ea0 4182f078 2c030e60 4182edb0 <4e800020> 7c781b78 480003c9 480003e1 
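
With this many CPUs listed in the stall report, it can help to summarize the
"GPs behind" counts rather than read them line by line. The following is a
minimal sketch, assuming the per-CPU line format shown in the trace above; the
names parse_stalls and worst_offenders are hypothetical helpers written for
this illustration and are not part of any kernel tooling.

```python
import re

# Matches one per-CPU line of an rcu_sched stall report, for example:
# "[ 4255.148446] \t0-...: (30267 GPs behind) idle=238/0/0 softirq=504/504 fqs=0"
# Assumption: the line format is exactly what appears in the trace above.
STALL_LINE = re.compile(
    r"\]\s+(?P<cpu>\d+)-\S*:\s+\((?P<gps>\d+) GPs behind\)"
)


def parse_stalls(dmesg_text):
    """Return {cpu: grace_periods_behind} for every stalled CPU found."""
    stalls = {}
    for line in dmesg_text.splitlines():
        match = STALL_LINE.search(line)
        if match:
            stalls[int(match.group("cpu"))] = int(match.group("gps"))
    return stalls


def worst_offenders(stalls, count=5):
    """Return the `count` CPUs that are the most grace periods behind."""
    ranked = sorted(stalls.items(), key=lambda item: item[1], reverse=True)
    return ranked[:count]


if __name__ == "__main__":
    # Three lines copied from the report above, used as sample input.
    sample = (
        "[ 4255.148446] \t0-...: (30267 GPs behind) idle=238/0/0 softirq=504/504 fqs=0\n"
        "[ 4255.150538] \t40-...: (18 GPs behind) idle=d00/0/0 softirq=1235/1235 fqs=0\n"
        "[ 4255.150996] \t78-...: (1 GPs behind) idle=b68/0/0 softirq=11962/11967 fqs=0\n"
    )
    stalls = parse_stalls(sample)
    print(len(stalls), "stalled CPUs")
    print(worst_offenders(stalls, count=2))
```

Feeding the full attached dmesg log to parse_stalls would give one entry per
stalled CPU, which makes it easier to see whether the lagging CPUs cluster on
particular cores or nodes.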


Regards,
Abdul Haleem
IBM Linux Technology Center.

View attachment "dmesglogs.txt" of type "text/plain" (173129 bytes)

View attachment "Tul-NV-config" of type "text/plain" (86717 bytes)
