lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <45f1ee63-793d-2a13-90cc-2312b17dffaa@opensource.wdc.com>
Date:   Fri, 7 Oct 2022 10:05:42 +0900
From:   Damien Le Moal <damien.lemoal@...nsource.wdc.com>
To:     Jeff Layton <jlayton@...hat.com>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        linux-ide <linux-ide@...r.kernel.org>,
        "linux-block@...r.kernel.org" <linux-block@...r.kernel.org>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: soft lockup with pre-rc1

On 10/7/22 02:33, Jeff Layton wrote:
> I pulled down Linus tree today and did some testing with some filesystem
> related patches. Running xfstests against ext4 in a VM, and I hit this:
> 
> [ 8947.343179] run fstests generic/224 at 2022-10-06 13:24:20
> [ 8948.435322] EXT4-fs (dm-1): mounted filesystem with ordered data mode. Quota mode: none.
> [ 8949.235839] EXT4-fs (dm-2): mounted filesystem with ordered data mode. Quota mode: none.
> [ 9017.622065] watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [kworker/2:0:1145882]
> [ 9023.009230] Modules linked in: ext4(E) mbcache(E) jbd2(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) rfkill(E) ip_set(E) nf_tables(E) nfnetlink(E) iTCO_wdt(E) intel_pmc_bxt(E) iTCO_vendor_support(E) joydev(E) intel_rapl_msr(E) i2c_i801(E) virtio_balloon(E) i2c_smbus(E) lpc_ich(E) intel_rapl_common(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) fuse(E) sunrpc(E) zram(E) xfs(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) ghash_clmulni_intel(E) virtio_gpu(E) virtio_dma_buf(E) drm_shmem_helper(E) serio_raw(E) virtio_net(E) drm_kms_helper(E) virtio_blk(E) net_failover(E) virtio_console(E) failover(E) drm(E) qemu_fw_cfg(E)
> [ 9023.081530] CPU: 2 PID: 1145882 Comm: kworker/2:0 Tainted: G            E      6.0.0+ #43
> [ 9023.085919] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-1.fc36 04/01/2014
> [ 9023.090747] Workqueue: events_freezable_power_ disk_events_workfn
> [ 9023.095155] RIP: 0010:__do_softirq+0xb5/0x401
> [ 9023.147050] Code: ff e8 6f db bc fe c7 44 24 1c 0a 00 00 00 48 c7 c7 c0 d2 87 ad e8 1b 9c be ff 65 66 c7 05 d1 0e a4 52 00 00 fb 0f 1f 44 00 00 <bb> ff ff ff ff 41 0f bc de 83 c3 01 0f 84 9a 00 00 00 48 c7 c5 c0
> [ 9023.158173] RSP: 0018:ffff888221909f80 EFLAGS: 00000296
> [ 9023.163217] RAX: 0000000000000002 RBX: ffff888107fb002c RCX: ffffffffac1cdc2d
> [ 9023.168569] RDX: dffffc0000000000 RSI: ffffffffad87d2c0 RDI: ffffffffada7f7e0
> [ 9023.173885] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffffafc01083
> [ 9023.179097] R10: fffffbfff5f80210 R11: 0000000000000000 R12: 0000000000000000
> [ 9023.184780] R13: 0000000000000000 R14: 0000000000000282 R15: ffff888107fb0000
> [ 9023.189955] FS:  0000000000000000(0000) GS:ffff888221900000(0000) knlGS:0000000000000000
> [ 9023.195329] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 9023.200380] CR2: 00007fba16a314e0 CR3: 0000000107882000 CR4: 00000000003506e0
> [ 9023.205569] Call Trace:
> [ 9023.209571]  <IRQ>
> [ 9023.213587]  __irq_exit_rcu+0x12a/0x190
> [ 9023.217664]  sysvec_apic_timer_interrupt+0x8a/0xb0
> [ 9023.221863]  </IRQ>
> [ 9023.225684]  <TASK>
> [ 9023.229384]  asm_sysvec_apic_timer_interrupt+0x16/0x20
> [ 9023.233485] RIP: 0010:_raw_spin_unlock_irqrestore+0x19/0x40
> [ 9023.237740] Code: c6 48 89 df e8 d8 54 fe fe 90 eb cc 0f 1f 44 00 00 0f 1f 44 00 00 e8 5a 4e fe fe 90 f7 c6 00 02 00 00 74 06 fb 0f 1f 44 00 00 <bf> 01 00 00 00 e8 ad 9a f8 fe 65 8b 05 16 a8 e2 52 85 c0 74 05 e9
> [ 9023.246961] RSP: 0018:ffff88810f9e7828 EFLAGS: 00000206
> [ 9023.251106] RAX: 0000000000000001 RBX: 0000000000000000 RCX: ffffffffacbdecee
> [ 9023.255535] RDX: dffffc0000000000 RSI: 0000000000000286 RDI: ffff888109e4d500
> [ 9023.259915] RBP: ffff88810a9d8000 R08: 0000000000000001 R09: ffff88810aa8894b
> [ 9023.264310] R10: ffffed1021551129 R11: fefefefefefefeff R12: 0000000000000286
> [ 9023.268834] R13: ffff88810a9d8010 R14: ffff8881013e0000 R15: ffff88810ab5a000
> [ 9023.273239]  ? ata_scsi_queuecmd+0x6e/0xb0
> [ 9023.277360]  ? _raw_spin_unlock_irqrestore+0xa/0x40
> [ 9023.281479]  ata_scsi_queuecmd+0x7a/0xb0
> [ 9023.285386]  scsi_queue_rq+0xb5c/0x13c0
> [ 9023.289449]  blk_mq_dispatch_rq_list+0x3e2/0xfc0
> [ 9023.293549]  ? blk_insert_cloned_request+0x330/0x330
> [ 9023.297646]  ? _raw_spin_lock+0x77/0xc0
> [ 9023.301593]  ? _raw_spin_lock_bh+0xc0/0xc0
> [ 9023.305485]  __blk_mq_sched_dispatch_requests+0x13b/0x1f0
> [ 9023.309609]  ? blk_mq_do_dispatch_ctx+0x340/0x340
> [ 9023.313581]  ? _raw_spin_lock_bh+0xc0/0xc0
> [ 9023.317304]  blk_mq_sched_dispatch_requests+0x82/0xb0
> [ 9023.320780]  __blk_mq_run_hw_queue+0x66/0xe0
> [ 9023.324965]  blk_mq_sched_insert_request+0x16f/0x1d0
> [ 9023.328294]  ? __alloc_pages_slowpath.constprop.0+0x1360/0x1360
> [ 9023.331757]  ? blk_mq_sched_bio_merge+0x180/0x180
> [ 9023.335040]  ? bio_add_pc_page+0x8a/0xb0
> [ 9023.339093]  blk_execute_rq+0x111/0x2c0
> [ 9023.343823]  ? blk_mq_get_sq_hctx+0x80/0x80
> [ 9023.348523]  __scsi_execute+0x13a/0x2b0
> [ 9023.353163]  sr_check_events+0x13e/0x420
> [ 9023.357071]  ? sr_packet+0x90/0x90
> [ 9023.359914]  ? __mod_timer+0x42a/0x5e0
> [ 9023.362727]  ? hibernation_snapshot+0x3e8/0x640
> [ 9023.365600]  ? console_emit_next_record.constprop.0+0x400/0x400
> [ 9023.368659]  cdrom_check_events+0x34/0x70
> [ 9023.371522]  disk_check_events+0x5d/0x180
> [ 9023.374430]  process_one_work+0x3d3/0x6a0
> [ 9023.377304]  worker_thread+0x8a/0x610
> [ 9023.380329]  ? process_one_work+0x6a0/0x6a0
> [ 9023.383202]  kthread+0x167/0x1a0
> [ 9023.385889]  ? kthread_complete_and_exit+0x20/0x20
> [ 9023.388668]  ret_from_fork+0x22/0x30
> [ 9023.391338]  </TASK>

Adding linux-block and linux-fsdevel, just in case.

Hmm... Weird. ATA changes for 6.1 are not merged yet, and none of them
touch ata_scsi_queue_cmd() or port locking anyway.

It looks like the ata lock is stuck, which may mean that some other
context is stuck in completion path ? Do you see any other task stuck in
ata code ?

> 
> Last commit in my tree is this:
> 
> 833477fce7a1 (origin/master) Merge tag 'sound-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
> 
> I also have some filesystem related patches on top of this. inode-
>> i_version patches mostly -- nothing that should effect things at this
> level.
> 
> config is attached.

-- 
Damien Le Moal
Western Digital Research

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ