Message-ID: <12a57ca6f77c8d95939014448de28659ea223e16.camel@kernel.org>
Date: Thu, 21 Mar 2024 07:20:45 -0400
From: Jeff Layton <jlayton@...nel.org>
To: linux-kernel <linux-kernel@...r.kernel.org>
Subject: crash in __pv_queued_spin_lock_slowpath while testing fstests
 generic/650

I set up a test run using a 6.9-pre kernel last night. Two of the machines
crashed with similar stack traces:

[ 9204.447148] run fstests generic/650 at 2024-03-20 17:45:46
[ 9206.062145] smpboot: CPU 4 is now offline
[ 9207.149663] smpboot: Booting Node 0 Processor 4 APIC 0x4
[ 9207.151712] x86/cpu: User Mode Instruction Prevention (UMIP) activated
[ 9207.192900] stack segment: 0000 [#2] PREEMPT SMP PTI
[ 9207.194185] CPU: 4 PID: 36 Comm: cpuhp/4 Tainted: G      D            6.8.0-g9f4a1748ce19 #167
[ 9207.196044] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-1.fc39 04/01/2014
[ 9207.197806] RIP: 0010:__pv_queued_spin_lock_slowpath+0x27a/0x370
[ 9207.199145] Code: 89 df e8 e9 c5 5f ff e9 02 ff ff ff 83 e0 03 c1 ea 12 48 c1 e0 05 48 8d a8 40 22 03 00 8d 42 ff 48 98 48 03 2c c5 e0 c4 6c 92 <4c> 89 75 00 b8 00 80 00 00 eb 13 84 c0 75 08 0f b6 55 14 84 d2 75
[ 9207.202877] RSP: 0000:ffff9b6b4016fd18 EFLAGS: 00010086
[ 9207.203988] RAX: 0000000000001110 RBX: ffff8ae8a31c20f4 RCX: 0000000000000001
[ 9207.205044] RDX: 0000000000001111 RSI: 0000000000000000 RDI: ffffffff9261d546
[ 9207.206079] RBP: ff7914a8ff7c30b2 R08: ffffffff930fed20 R09: ffff8ae880910788
[ 9207.207114] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9b6b4016fda0
[ 9207.208162] R13: 0000000000140000 R14: ffff8ae9f7d32240 R15: 0000000000000000
[ 9207.209200] FS:  0000000000000000(0000) GS:ffff8ae9f7d00000(0000) knlGS:0000000000000000
[ 9207.210391] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9207.211273] CR2: 0000000000000000 CR3: 000000004e01a001 CR4: 0000000000770ef0
[ 9207.212355] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 9207.213403] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 9207.214440] PKRU: 55555554
[ 9207.214867] Call Trace:
[ 9207.215278]  <TASK>
[ 9207.215716]  ? die+0x32/0x80
[ 9207.216192]  ? do_trap+0xd9/0x100
[ 9207.216711]  ? do_error_trap+0x6a/0x90
[ 9207.217292]  ? exc_stack_segment+0x33/0x50
[ 9207.217941]  ? asm_exc_stack_segment+0x22/0x30
[ 9207.218625]  ? __pv_queued_spin_lock_slowpath+0x27a/0x370
[ 9207.219455]  ? __pv_queued_spin_lock_slowpath+0x6c/0x370
[ 9207.220442]  _raw_spin_lock_irqsave+0x44/0x50
[ 9207.221857]  task_rq_lock+0x29/0x100
[ 9207.222739]  ? __pfx_workqueue_online_cpu+0x10/0x10
[ 9207.223920]  __set_cpus_allowed_ptr+0x2d/0xa0
[ 9207.225103]  set_cpus_allowed_ptr+0x37/0x60
[ 9207.226241]  workqueue_online_cpu+0x242/0x320
[ 9207.227554]  ? __pfx_workqueue_online_cpu+0x10/0x10
[ 9207.228806]  cpuhp_invoke_callback+0xf5/0x450
[ 9207.229901]  ? __pfx_smpboot_thread_fn+0x10/0x10
[ 9207.231075]  cpuhp_thread_fun+0xe7/0x160
[ 9207.232002]  smpboot_thread_fn+0x184/0x220
[ 9207.233001]  kthread+0xda/0x110
[ 9207.233860]  ? __pfx_kthread+0x10/0x10
[ 9207.234840]  ret_from_fork+0x2d/0x50
[ 9207.235480]  ? __pfx_kthread+0x10/0x10
[ 9207.236090]  ret_from_fork_asm+0x1a/0x30
[ 9207.236725]  </TASK>
[ 9207.237116] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink sunrpc 9p netfs siw ib_uverbs ib_core kvm_intel joydev kvm virtio_net pcspkr psmouse 9pnet_virtio net_failover failover virtio_balloon button evdev loop drm dm_mod zram zsmalloc crct10dif_pclmul crc32_pclmul ghash_clmulni_intel sha512_ssse3 sha512_generic xfs sha256_ssse3 libcrc32c crc32c_generic sha1_ssse3 crc32c_intel nvme nvme_core t10_pi aesni_intel crc64_rocksoft_generic crypto_simd crc64_rocksoft cryptd crc64 virtio_blk virtio_console serio_raw virtio_pci virtio virtio_pci_legacy_dev virtio_pci_modern_dev virtio_ring autofs4
[ 9207.247496] ---[ end trace 0000000000000000 ]---
[ 9207.248273] RIP: 0010:nfs_inode_find_state_and_recover+0x8e/0x250 [nfsv4]
[ 9207.249491] Code: 00 00 49 8b 47 40 a8 01 0f 85 a4 00 00 00 49 8b 85 80 00 00 00 4c 8d 68 80 48 39 c3 74 52 4d 8b 7d 60 4d 85 ff 74 e7 49 8b 0e <49> 39 4f 58 75 bd 41 8b 7e 08 41 39 7f 60 75 b3 41 8b 14 24 85 d2
[ 9207.252441] RSP: 0018:ffff9b6b4ed63d38 EFLAGS: 00010206
[ 9207.253336] RAX: ffff8ae8a0c1bd80 RBX: ffff8ae8b2773bd8 RCX: 96c8a28e65fafadb
[ 9207.254537] RDX: ffff8ae8a31c1840 RSI: ffff8ae88a855904 RDI: ffff8ae8b2773c58
[ 9207.255670] RBP: ffff8ae8b2773c58 R08: ffff8ae88005f428 R09: ffff8ae88a850910
[ 9207.256829] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8ae88a855904
[ 9207.257922] R13: ffff8ae8a0c1bd00 R14: ffff8ae88a855908 R15: 0000010000002000
[ 9207.259108] FS:  0000000000000000(0000) GS:ffff8ae9f7d00000(0000) knlGS:0000000000000000
[ 9207.260363] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9207.261274] CR2: 0000000000000000 CR3: 000000004e01a001 CR4: 0000000000770ef0
[ 9207.262436] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 9207.263598] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 9207.264795] PKRU: 55555554
[ 9207.265262] note: cpuhp/4[36] exited with irqs disabled
[ 9207.266096] note: cpuhp/4[36] exited with preempt_count 1

$ ./scripts/faddr2line --list vmlinux __pv_queued_spin_lock_slowpath+0x27a/0x370
__pv_queued_spin_lock_slowpath+0x27a/0x370:

__pv_queued_spin_lock_slowpath at kernel/locking/qspinlock.c:474 (discriminator 2)
 469 		 */
 470 		if (old & _Q_TAIL_MASK) {
 471 			prev = decode_tail(old);
 472 	
 473 			/* Link @node into the waitqueue. */
>474<			WRITE_ONCE(prev->next, node);
 475 	
 476 			pv_wait_node(node, prev);
 477 			arch_mcs_spin_lock_contended(&node->locked);
 478 	
 479 			/*
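
If the tree doesn't have ./scripts/faddr2line handy, gdb can resolve the same
symbol+offset, assuming a vmlinux built with debug info; this is just an
alternative way to get the same file:line answer:

$ gdb -batch -ex 'info line *(__pv_queued_spin_lock_slowpath+0x27a)' vmlinux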


generic/650 offlines and onlines CPUs while the test runs, so CPU hotplug
may be a factor here. The kernel is based on commit a4145ce1e7bc ("Merge tag
'bcachefs-2024-03-19' of https://evilpiepirate.org/git/bcachefs"), with some
directory delegation patches on top that I doubt are a factor.
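
In case it helps anyone trying to reproduce this outside of fstests, the
hotplug churn can be approximated with a sysfs loop along these lines (the
CPU list, sleep, and iteration count are arbitrary; run fsstress or a
similar workload alongside it):

#!/bin/sh
# Repeatedly offline and online a few CPUs to mimic the hotplug
# stress that generic/650 applies while its workload runs.
for i in $(seq 1 100); do
	for cpu in 1 2 3 4; do
		echo 0 > /sys/devices/system/cpu/cpu$cpu/online
		sleep 1
		echo 1 > /sys/devices/system/cpu/cpu$cpu/online
	done
done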

Let me know if there's any other info you need!
-- 
Jeff Layton <jlayton@...nel.org>
