Message-ID: <20251228183129.17193-1-bbaa@bbaa.fun>
Date: Mon, 29 Dec 2025 02:31:24 +0800
From: Ban ZuoXiang <bbaa@...a.fun>
To: aliceryhl@...gle.com, gregkh@...uxfoundation.org
Cc: ojeda@...nel.org, alex.gaynor@...il.com, linux-mm@...ck.org,
rust-for-linux@...r.kernel.org, stable@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: [BUG] soft lockup in kswapd0 caused by Rust binder
Hello,
Many users [1][2][3] have reported a kernel soft lockup in the
kswapd0 task when running Waydroid (an Android container solution) on
kernels with the new Rust Binder driver.
The issue manifests as a soft lockup: kswapd0 is pegged at 100% system
time, spinning on a spinlock in the list_lru_walk path that is invoked
from the Rust Binder shrinker.
Kernel Log:
12 25 01:23:57 arch-laptop kernel: watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [kswapd0:142]
12 25 01:23:57 arch-laptop kernel: CPU#0 Utilization every 4000ms during lockup:
12 25 01:23:57 arch-laptop kernel: #1: 100% system, 0% softirq, 1% hardirq, 0% idle
12 25 01:23:57 arch-laptop kernel: #2: 100% system, 0% softirq, 1% hardirq, 0% idle
12 25 01:23:57 arch-laptop kernel: #3: 100% system, 0% softirq, 1% hardirq, 0% idle
12 25 01:23:57 arch-laptop kernel: #4: 100% system, 0% softirq, 1% hardirq, 0% idle
12 25 01:23:57 arch-laptop kernel: #5: 100% system, 0% softirq, 1% hardirq, 0% idle
12 25 01:23:57 arch-laptop kernel: Modules linked in: sch_ingress af_key tcp_diag udp_diag inet_diag nfnetlink_log xfrm_user xfrm_algo xfrm_interface xfrm6_tunnel tunnel4 tunnel6 vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci veth overlay loop nft_masq nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bridge stp llc nf_tables tun rfcomm snd_seq_dummy snd_hrtimer snd_seq cmac algif_hash algif_skcipher af_alg bnep vfat fat iwlmvm intel_rapl_msr amd_atl amdgpu intel_rapl_common mac80211 ptp pps_core libarc4 amdxcp snd_hda_codec_nvhdmi drm_panel_backlight_quirks snd_hda_codec_hdmi gpu_sched snd_usb_audio drm_buddy iwlwifi kvm_amd drm_exec uvcvideo drm_suballoc_helper snd_hda_intel drm_ttm_helper btusb kvm videobuf2_vmalloc spd5118 r8169 snd_hda_codec snd_usbmidi_lib ttm ucsi_acpi btmtk uvc snd_hda_core cfg80211 realtek btrtl irqbypass i2c_algo_bit snd_ump videobuf2_memops asus_nb_wmi typec_ucsi snd_intel_dspcfg sp5100_tco mdio_devres polyval_clmulni videobuf2_v4l2 btbcm amd_pmf
12 25 01:23:57 arch-laptop kernel: ghash_clmulni_intel snd_rawmidi drm_display_helper snd_intel_sdw_acpi asus_wmi libphy typec videobuf2_common i2c_piix4 btintel aesni_intel snd_hwdep snd_seq_device amdtee hid_multitouch bluetooth videodev cec wmi_bmof sparse_keymap rapl pcspkr roles ccp k10temp rfkill video i2c_smbus mdio_bus amd_sfh snd_pcm thunderbolt i2c_hid_acpi platform_profile snd_timer wmi i2c_hid tee snd amd_pmc soundcore mc mousedev joydev mac_hid tcp_bbr sch_fq_pie sch_pie i2c_dev pkcs8_key_parser ntsync crypto_user nfnetlink hid_logitech_hidpp hid_logitech_dj nvme nvme_core nvme_keyring nvme_auth hkdf serio_raw
12 25 01:23:57 arch-laptop kernel: CPU: 0 UID: 0 PID: 142 Comm: kswapd0 Not tainted 6.18.2-zen2-1-zen #1 PREEMPT(full) 817688afc19ca15a22737742591535351aba70f8
12 25 01:23:57 arch-laptop kernel: Hardware name: ASUSTeK COMPUTER INC. ASUS TUF Gaming A15 FA507RM_FA507RM/FA507RM, BIOS FA507RM.315 11/30/2022
12 25 01:23:57 arch-laptop kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x67/0x2e0
12 25 01:23:57 arch-laptop kernel: Code: 0f 92 c2 8b 01 0f b6 d2 c1 e2 08 30 e4 09 d0 3d ff 00 00 00 0f 87 1e 02 00 00 85 c0 74 10 0f b6 01 84 c0 74 09 f3 90 0f b6 01 <84> c0 75 f7 b8 01 00 00 00 66 89 01 65 48 ff 05 ad af ed 01 c3 cc
12 25 01:23:57 arch-laptop kernel: RSP: 0018:ffffd4c4c06e3a30 EFLAGS: 00000202
12 25 01:23:57 arch-laptop kernel: RAX: 0000000000000001 RBX: ffffd4c4c06e3b10 RCX: ffff8d69446aa698
12 25 01:23:57 arch-laptop kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8d69446aa698
12 25 01:23:57 arch-laptop kernel: RBP: ffffffff94118f38 R08: 0000000000000000 R09: 0000000000000000
12 25 01:23:57 arch-laptop kernel: R10: ffff8d6fa1e38340 R11: ffff8d6fbe2d6000 R12: ffff8d6c5aaf8000
12 25 01:23:57 arch-laptop kernel: R13: ffffffff91d15410 R14: ffff8d69446aa680 R15: ffff8d69446aa680
12 25 01:23:57 arch-laptop kernel: FS: 0000000000000000(0000) GS:ffff8d700deaf000(0000) knlGS:0000000000000000
12 25 01:23:57 arch-laptop kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
12 25 01:23:57 arch-laptop kernel: CR2: 00000000100af568 CR3: 000000020da24000 CR4: 0000000000f50ef0
12 25 01:23:57 arch-laptop kernel: PKRU: 55555554
12 25 01:23:57 arch-laptop kernel: Call Trace:
12 25 01:23:57 arch-laptop kernel: <TASK>
12 25 01:23:57 arch-laptop kernel: _raw_spin_lock+0x29/0x30
12 25 01:23:57 arch-laptop kernel: __list_lru_walk_one.constprop.0+0x94/0x1d0
12 25 01:23:57 arch-laptop kernel: ? __pfx_rust_shrink_free_page_wrap+0x10/0x10
12 25 01:23:57 arch-laptop kernel: ? __pfx_rust_shrink_free_page_wrap+0x10/0x10
12 25 01:23:57 arch-laptop kernel: list_lru_walk_node+0x46/0x1f0
12 25 01:23:57 arch-laptop kernel: ? __pfx_rust_shrink_free_page_wrap+0x10/0x10
12 25 01:23:57 arch-laptop kernel: rust_helper_list_lru_walk+0x9d/0xe0
12 25 01:23:57 arch-laptop kernel: do_shrink_slab+0x140/0x350
12 25 01:23:57 arch-laptop kernel: shrink_slab+0xd7/0x3e0
12 25 01:23:57 arch-laptop kernel: shrink_one+0xfe/0x1d0
12 25 01:23:57 arch-laptop kernel: shrink_node+0xb4a/0xd60
12 25 01:23:57 arch-laptop kernel: ? pgdat_balanced+0x83/0x140
12 25 01:23:57 arch-laptop kernel: kswapd+0x870/0x1100
12 25 01:23:57 arch-laptop kernel: ? __switch_to+0x103/0x3f0
12 25 01:23:57 arch-laptop kernel: ? __pfx_kswapd+0x10/0x10
12 25 01:23:57 arch-laptop kernel: kthread+0xfc/0x240
12 25 01:23:57 arch-laptop kernel: ? __pfx_kthread+0x10/0x10
12 25 01:23:57 arch-laptop kernel: ret_from_fork+0x1c2/0x1f0
12 25 01:23:57 arch-laptop kernel: ? __pfx_kthread+0x10/0x10
12 25 01:23:57 arch-laptop kernel: ret_from_fork_asm+0x1a/0x30
12 25 01:23:57 arch-laptop kernel: </TASK>
rust/helpers/binder.c:
unsigned long rust_helper_list_lru_walk(struct list_lru *lru,
					list_lru_walk_cb isolate, void *cb_arg,
					unsigned long nr_to_walk)
{
	return list_lru_walk(lru, isolate, cb_arg, nr_to_walk);
}
A patch that appears to address this issue already exists:
'rust: binder: stop spinning in shrinker' [4]
I have tested this patch, and it appears to resolve the soft lockup.
Could this patch be picked up to fix the regression?
[1] https://github.com/waydroid/waydroid/issues/2163
[2] https://bbs.archlinux.org/viewtopic.php?id=311223
[3] https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/issues/174
[4] https://lore.kernel.org/all/20251202-binder-shrink-unspin-v1-1-263efb9ad625@google.com/
Regards,
Ban ZuoXiang