Message-ID: <8d1ebe64-f5df-43d4-8e4d-20f934daff45@linux.vnet.ibm.com>
Date: Mon, 10 Feb 2025 10:04:29 +0530
From: Venkat Rao Bagalkote <venkat88@...ux.vnet.ibm.com>
To: linux-kernel@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org
Cc: sfr@...b.auug.org.au
Subject: [linux-next][next-20250207]Observing Kernel Softlock up's while
 running kselftest

Greetings!!!

I am observing kernel soft lockups while running kselftest on IBM 
Power servers.

Although I could not reproduce this consistently, CI has detected 
this error twice now, hence this report.

The error was first reported while running the signal component tests, 
and the second time while running the EEH component tests:

linux-next/tools/testing/selftests/powerpc/signal

linux-next/tools/testing/selftests/powerpc/eeh



Traces:

[11480.019928] watchdog: BUG: soft lockup - CPU#0 stuck for 26s! 
[swapper/0:0]
[11480.019935] Modules linked in: nvram(E) rpadlpar_io(E) rpaphp(E) 
dm_mod(E) bonding(E) tls(E) nft_fib_inet(E) nft_fib_ipv4(E) 
nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) 
nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) 
nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) rfkill(E) ip_set(E) 
nf_tables(E) nfnetlink(E) hvcs(E) pseries_rng(E) hvcserver(E) 
vmx_crypto(E) drm(E) drm_panel_orientation_quirks(E) xfs(E) lpfc(E) 
sr_mod(E) sd_mod(E) cdrom(E) sg(E) nvmet_fc(E) ibmvscsi(E) nvmet(E) 
ibmveth(E) scsi_transport_srp(E) nvme_fc(E) nvme_fabrics(E) bnx2x(E) 
nvme_core(E) be2net(E) mdio(E) scsi_transport_fc(E) fuse(E) [last 
unloaded: test_cpuidle_latency(OE)]
[11480.019990] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Kdump: loaded 
Tainted: G           OE      6.14.0-rc1-next-20250207 #1
[11480.019995] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[11480.019996] Hardware name: IBM,8375-42A POWER9 (architected) 0x4e0202 
0xf000005 of:IBM,FW950.80 (VL950_131) hv:phyp pSeries
[11480.019997] NIP:  c00000000003a2d0 LR: c00000000003a644 CTR: 
c0000000002a912c
[11480.020000] REGS: c0000003bffffb28 TRAP: 0900   Tainted: G           
OE       (6.14.0-rc1-next-20250207)
[11480.020002] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 
22042442  XER: 20040000
[11480.020009] CFAR: 0000000000000000 IRQMASK: 0
[11480.020009] GPR00: c00000000003a644 c0000003bffffb00 c000000001667500 
c0000003bffffaf8
[11480.020009] GPR04: c000000004062940 c0000003bffffd20 0000000000000001 
c000000002277ca0
[11480.020009] GPR08: 0000000000000003 0000000000000049 0000000000000000 
0000000000002000
[11480.020009] GPR12: c0000000002a912c c000000003000000 0000000000000000 
0000000000000000
[11480.020009] GPR16: 0000000000000001 0000000000000082 0000000000000001 
0000000000000100
[11480.020009] GPR20: 0000000004200002 0000000000000000 0000000000000000 
0000000100110511
[11480.020009] GPR24: 7fffffffffffffff 0000000000000001 00000003bd5a0000 
0000000000000000
[11480.020009] GPR28: 0000000000000002 0000000000000003 fcffffffffffffff 
fcffffffffffffff
[11480.020036] NIP [c00000000003a2d0] __replay_soft_interrupts+0x5c/0x22c
[11480.020048] LR [c00000000003a644] arch_local_irq_restore+0x1a4/0x280
[11480.020053] Call Trace:
[11480.020054] [c0000003bffffb00] [c00000000003a358] 
__replay_soft_interrupts+0xe4/0x22c (unreliable)
[11480.020060] [c0000003bffffcb0] [c00000000003a644] 
arch_local_irq_restore+0x1a4/0x280
[11480.020064] [c0000003bffffcf0] [c0000000002a9d60] 
tmigr_handle_remote_cpu+0x24c/0x318
[11480.020071] [c0000003bffffda0] [c0000000002aa034] 
tmigr_handle_remote_up+0x208/0x2d0
[11480.020075] [c0000003bffffe10] [c0000000002a7d34] 
__walk_groups.isra.0+0x6c/0x100
[11480.020079] [c0000003bffffe50] [c0000000002aa2d0] 
tmigr_handle_remote+0xf0/0x170
[11480.020083] [c0000003bffffed0] [c0000000002876a4] 
run_timer_softirq+0x54/0x68
[11480.020089] [c0000003bffffef0] [c000000000179128] 
handle_softirqs+0x148/0x3b4
[11480.020094] [c0000003bfffffe0] [c000000000017f30] 
do_softirq_own_stack+0x3c/0x50
[11480.020100] [c000000002c87900] [c000000000178688] 
__irq_exit_rcu+0x18c/0x1b4
[11480.020102] [c000000002c87930] [c000000000179758] irq_exit+0x20/0x38
[11480.020105] [c000000002c87950] [c00000000002b004] 
timer_interrupt+0x128/0x300
[11480.020108] [c000000002c879b0] [c000000000009ffc] 
decrementer_common_virt+0x28c/0x290
[11480.020113] --- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
[11480.020119] NIP:  c0000000000fb9d4 LR: c0000000010c2348 CTR: 
0000000000000000
[11480.020120] REGS: c000000002c879e0 TRAP: 0900   Tainted: G           
OE       (6.14.0-rc1-next-20250207)
[11480.020122] MSR:  800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  
CR: 22000248  XER: 20040000
[11480.020129] CFAR: 0000000000000000 IRQMASK: 0
[11480.020129] GPR00: 0000000000000000 c000000002c87c80 c000000001667500 
0000000000000000
[11480.020129] GPR04: 000000000000ffff 0000000000000000 0000000000000000 
0000000000000000
[11480.020129] GPR08: 0000000000000000 0000000000000000 80000000c7a3fc00 
ffffffffffffffff
[11480.020129] GPR12: 0000000000000000 c000000003000000 0000000000000000 
0000000000000000
[11480.020129] GPR16: 0000000000000000 0000000000000000 0000000000000000 
0000000000000000
[11480.020129] GPR20: 0000000000c00000 0000000000000008 0000000000000000 
0000000000000000
[11480.020129] GPR24: 0000000000000000 0000000000000000 00000a6adcf558a4 
0000000000000000
[11480.020129] GPR28: 0000000000000000 0000000000000001 c0000000022618e0 
c0000000022618e8
[11480.020155] NIP [c0000000000fb9d4] plpar_hcall_norets_notrace+0x18/0x2c
[11480.020158] LR [c0000000010c2348] check_and_cede_processor+0x48/0x5c
[11480.020162] --- interrupt: 900
[11480.020163] [c000000002c87c80] [c00000000028a8b0] 
__hrtimer_start_range_ns+0x160/0x2ec (unreliable)
[11480.020168] [c000000002c87ce0] [c0000000010c2790] 
dedicated_cede_loop+0x94/0x1a0
[11480.020171] [c000000002c87d30] [c0000000010c1d80] 
cpuidle_enter_state+0x3b4/0x5b4
[11480.020174] [c000000002c87dd0] [c000000000cac55c] cpuidle_enter+0x4c/0x68
[11480.020178] [c000000002c87e10] [c0000000001eb5b4] call_cpuidle+0x4c/0x94
[11480.020184] [c000000002c87e30] [c0000000001f3798] 
cpuidle_idle_call+0x164/0x240
[11480.020188] [c000000002c87e90] [c0000000001f3974] do_idle+0x100/0x1ac
[11480.020192] [c000000002c87ee0] [c0000000001f3ca4] 
cpu_startup_entry+0x48/0x50
[11480.020196] [c000000002c87f10] [c000000000011280] rest_init+0xf0/0xf4
[11480.020199] [c000000002c87f40] [c000000002006604] 
start_kernel+0x50c/0x5e0
[11480.020204] [c000000002c87fe0] [c00000000000ea9c] 
start_here_common+0x1c/0x20
[11480.020207] Code: 71298000 408201ec 892d0933 7d2a48f8 554a07fe 
0b0a0000 792ad7e2 0b0a0000 61290040 38610028 992d0933 480421c9 
<60000000> 39200000 e9410130 f9210160


Regards,

Venkat.

