Message-ID: <tencent_96252FF6CE27E9F41F13AC73CCC1BE350905@qq.com>
Date: Thu, 11 Dec 2025 16:08:13 +0800
From: wujing <realwujing@...com>
To: Jason Gunthorpe <jgg@...pe.ca>,
Leon Romanovsky <leon@...nel.org>
Cc: linux-rdma@...r.kernel.org,
linux-kernel@...r.kernel.org,
wujing <realwujing@...com>,
Qiliang Yuan <yuanql9@...natelecom.cn>
Subject: [PATCH] IB/core: Fix ABBA deadlock in rdma_dev_exit_net

Fix an ABBA deadlock between rdma_dev_exit_net() and rdma_dev_init_net()
that causes a large number of processes to get stuck in D state and
triggers a soft lockup.

The problem was discovered in a production environment running stress-ng
with network namespace operations. After 120+ seconds, multiple processes
were stuck in D state and a soft lockup was eventually triggered on one
CPU, leading to a system panic.
Full kernel log trace from the production crash:
[32754.001139] INFO: task kworker/u256:1:1700886 blocked for more than 120 seconds.
[32754.008609] Tainted: G W O 6.6.0-0006.ctl4.aarch64 #1
[32754.016498] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[32754.024972] task:kworker/u256:1 state:D stack:0 pid:1700886 ppid:2 flags:0x00000208
[32754.034077] Workqueue: netns cleanup_net
[32754.043234] Call trace:
[32754.052459] __switch_to+0x170/0x238
[32754.062013] __schedule+0x428/0xa08
[32754.071633] schedule+0x58/0x130
[32754.081301] schedule_preempt_disabled+0x18/0x30
[32754.091252] rwsem_down_write_slowpath+0x2a4/0x880
[32754.101419] down_write+0x60/0x78
[32754.111732] rdma_dev_exit_net+0x60/0x1d8 [ib_core]
[32754.122500] ops_exit_list+0x4c/0x90
[32754.133311] cleanup_net+0x2ac/0x580
[32754.144266] process_one_work+0x170/0x3c0
[32754.155451] worker_thread+0x22c/0x4d0
[32754.166775] kthread+0xf8/0x128
[32754.178219] ret_from_fork+0x10/0x20
[32754.229887] INFO: task stress-ng-clone:1848460 blocked for more than 121 seconds.
[32754.242302] Tainted: G W O 6.6.0-0006.ctl4.aarch64 #1
[32754.255156] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[32754.268609] task:stress-ng-clone state:D stack:0 pid:1848460 ppid:1705870 flags:0x0000020c
[32754.282744] Call trace:
[32754.296845] __switch_to+0x170/0x238
[32754.311182] __schedule+0x428/0xa08
[32754.325699] schedule+0x58/0x130
[32754.340345] schedule_preempt_disabled+0x18/0x30
[32754.355259] rwsem_down_read_slowpath+0x188/0x670
[32754.370341] down_read+0x38/0xd8
[32754.385557] rdma_dev_init_net+0x120/0x210 [ib_core]
[32754.401216] ops_init+0x80/0x160
[32754.416952] setup_net+0x114/0x338
[32754.432814] copy_net_ns+0x144/0x310
[32754.448829] create_new_namespaces+0x108/0x360
[32754.465123] unshare_nsproxy_namespaces+0x68/0xb8
[32754.481661] ksys_unshare+0x124/0x3f8
[32754.498367] __arm64_sys_unshare+0x1c/0x38
[32754.515280] invoke_syscall+0x50/0x128
[32754.532337] el0_svc_common.constprop.0+0xc8/0xf0
[32754.549706] do_el0_svc+0x24/0x38
[32754.567213] el0_svc+0x50/0x1e0
[32754.584822] el0t_64_sync_handler+0x100/0x130
[32754.602699] el0t_64_sync+0x1a4/0x1a8
[32754.622898] INFO: task stress-ng-clone:1855770 blocked for more than 121 seconds.
[32754.641630] Tainted: G W O 6.6.0-0006.ctl4.aarch64 #1
[32754.660796] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[32754.680588] task:stress-ng-clone state:D stack:0 pid:1855770 ppid:1703005 flags:0x0000020c
[32754.701003] Call trace:
[32754.721401] __switch_to+0x170/0x238
[32754.742070] __schedule+0x428/0xa08
[32754.762820] schedule+0x58/0x130
[32754.783656] schedule_preempt_disabled+0x18/0x30
[32754.804827] rwsem_down_read_slowpath+0x188/0x670
[32754.826210] down_read+0x38/0xd8
[32754.847677] rdma_dev_init_net+0x120/0x210 [ib_core]
[32754.869601] ops_init+0x80/0x160
[32754.890747] setup_net+0x114/0x338
[32754.912072] copy_net_ns+0x144/0x310
[32754.933567] create_new_namespaces+0x108/0x360
[32754.955403] unshare_nsproxy_namespaces+0x68/0xb8
[32754.977480] ksys_unshare+0x124/0x3f8
[32754.999696] __arm64_sys_unshare+0x1c/0x38
[32755.022211] invoke_syscall+0x50/0x128
[32755.044865] el0_svc_common.constprop.0+0xc8/0xf0
[32755.067857] do_el0_svc+0x24/0x38
[32755.091009] el0_svc+0x50/0x1e0
[32755.113669] el0t_64_sync_handler+0x100/0x130
[32755.136195] el0t_64_sync+0x1a4/0x1a8
[32755.158514] INFO: task stress-ng-clone:1856643 blocked for more than 121 seconds.
[32755.180811] Tainted: G W O 6.6.0-0006.ctl4.aarch64 #1
[32755.203035] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[32755.225684] task:stress-ng-clone state:D stack:0 pid:1856643 ppid:1703079 flags:0x0000020c
[32755.248867] Call trace:
[32755.271902] __switch_to+0x170/0x238
[32755.295058] __schedule+0x428/0xa08
[32755.318173] schedule+0x58/0x130
[32755.341211] schedule_preempt_disabled+0x18/0x30
[32755.364281] rwsem_down_read_slowpath+0x188/0x670
[32755.387320] down_read+0x38/0xd8
[32755.410218] rdma_dev_init_net+0x120/0x210 [ib_core]
[32755.433439] ops_init+0x80/0x160
[32755.456537] setup_net+0x114/0x338
[32755.479597] copy_net_ns+0x144/0x310
[32755.502674] create_new_namespaces+0x108/0x360
[32755.525888] unshare_nsproxy_namespaces+0x68/0xb8
[32755.548885] ksys_unshare+0x124/0x3f8
[32755.571533] __arm64_sys_unshare+0x1c/0x38
[32755.593903] invoke_syscall+0x50/0x128
[32755.615804] el0_svc_common.constprop.0+0xc8/0xf0
[32755.637511] do_el0_svc+0x24/0x38
[32755.659193] el0_svc+0x50/0x1e0
[32755.680845] el0t_64_sync_handler+0x100/0x130
[32755.702648] el0t_64_sync+0x1a4/0x1a8
[32755.724966] INFO: task stress-ng-clone:1857557 blocked for more than 122 seconds.
[32755.747272] Tainted: G W O 6.6.0-0006.ctl4.aarch64 #1
[32755.769740] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[32755.792562] task:stress-ng-clone state:D stack:0 pid:1857557 ppid:1704397 flags:0x0000020c
[32755.815790] Call trace:
[32755.838868] __switch_to+0x170/0x238
[32755.862070] __schedule+0x428/0xa08
[32755.885171] schedule+0x58/0x130
[32755.908174] schedule_preempt_disabled+0x18/0x30
[32755.931239] rwsem_down_read_slowpath+0x188/0x670
[32755.954317] down_read+0x38/0xd8
[32755.977330] rdma_dev_init_net+0x120/0x210 [ib_core]
[32756.000549] ops_init+0x80/0x160
[32756.023585] setup_net+0x114/0x338
[32756.046639] copy_net_ns+0x144/0x310
[32756.069664] create_new_namespaces+0x108/0x360
[32756.092850] unshare_nsproxy_namespaces+0x68/0xb8
[32756.115819] ksys_unshare+0x124/0x3f8
[32756.138451] __arm64_sys_unshare+0x1c/0x38
[32756.160814] invoke_syscall+0x50/0x128
[32756.182721] el0_svc_common.constprop.0+0xc8/0xf0
[32756.204411] do_el0_svc+0x24/0x38
[32756.226090] el0_svc+0x50/0x1e0
[32756.247750] el0t_64_sync_handler+0x100/0x130
[32756.269569] el0t_64_sync+0x1a4/0x1a8
[32756.291600] INFO: task stress-ng-clone:1858428 blocked for more than 123 seconds.
[32756.313908] Tainted: G W O 6.6.0-0006.ctl4.aarch64 #1
[32756.336373] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[32756.359199] task:stress-ng-clone state:D stack:0 pid:1858428 ppid:1705594 flags:0x0000020c
[32756.382466] Call trace:
[32756.405568] __switch_to+0x170/0x238
[32756.428780] __schedule+0x428/0xa08
[32756.451891] schedule+0x58/0x130
[32756.474900] schedule_preempt_disabled+0x18/0x30
[32756.497974] rwsem_down_read_slowpath+0x188/0x670
[32756.521035] down_read+0x38/0xd8
[32756.544056] rdma_dev_init_net+0x120/0x210 [ib_core]
[32756.567272] ops_init+0x80/0x160
[32756.590318] setup_net+0x114/0x338
[32756.613377] copy_net_ns+0x144/0x310
[32756.636399] create_new_namespaces+0x108/0x360
[32756.659576] unshare_nsproxy_namespaces+0x68/0xb8
[32756.682534] ksys_unshare+0x124/0x3f8
[32756.705186] __arm64_sys_unshare+0x1c/0x38
[32756.727548] invoke_syscall+0x50/0x128
[32756.749445] el0_svc_common.constprop.0+0xc8/0xf0
[32756.771143] do_el0_svc+0x24/0x38
[32756.792793] el0_svc+0x50/0x1e0
[32756.814425] el0t_64_sync_handler+0x100/0x130
[32756.836214] el0t_64_sync+0x1a4/0x1a8
[32756.858417] INFO: task stress-ng-clone:1859786 blocked for more than 123 seconds.
[32756.880761] Tainted: G W O 6.6.0-0006.ctl4.aarch64 #1
[32756.903208] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[32756.926018] task:stress-ng-clone state:D stack:0 pid:1859786 ppid:1703833 flags:0x0000020c
[32756.949236] Call trace:
[32756.972318] __switch_to+0x170/0x238
[32756.995526] __schedule+0x428/0xa08
[32757.018612] schedule+0x58/0x130
[32757.041608] schedule_preempt_disabled+0x18/0x30
[32757.064675] rwsem_down_read_slowpath+0x188/0x670
[32757.087750] down_read+0x38/0xd8
[32757.110779] rdma_dev_init_net+0x120/0x210 [ib_core]
[32757.134014] ops_init+0x80/0x160
[32757.157037] setup_net+0x114/0x338
[32757.180100] copy_net_ns+0x144/0x310
[32757.203140] create_new_namespaces+0x108/0x360
[32757.226329] unshare_nsproxy_namespaces+0x68/0xb8
[32757.249304] ksys_unshare+0x124/0x3f8
[32757.271940] __arm64_sys_unshare+0x1c/0x38
[32757.294288] invoke_syscall+0x50/0x128
[32757.316214] el0_svc_common.constprop.0+0xc8/0xf0
[32757.337905] do_el0_svc+0x24/0x38
[32757.359561] el0_svc+0x50/0x1e0
[32757.381189] el0t_64_sync_handler+0x100/0x130
[32757.402989] el0t_64_sync+0x1a4/0x1a8
[32757.425586] INFO: task stress-ng-clone:1862292 blocked for more than 124 seconds.
[32757.447864] Tainted: G W O 6.6.0-0006.ctl4.aarch64 #1
[32757.470299] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[32757.493106] task:stress-ng-clone state:D stack:0 pid:1862292 ppid:1707297 flags:0x0000020c
[32757.516329] Call trace:
[32757.539411] __switch_to+0x170/0x238
[32757.562597] __schedule+0x428/0xa08
[32757.585708] schedule+0x58/0x130
[32757.608704] schedule_preempt_disabled+0x18/0x30
[32757.631753] rwsem_down_read_slowpath+0x188/0x670
[32757.654791] down_read+0x38/0xd8
[32757.677767] rdma_dev_init_net+0x120/0x210 [ib_core]
[32757.700941] ops_init+0x80/0x160
[32757.723941] setup_net+0x114/0x338
[32757.746951] copy_net_ns+0x144/0x310
[32757.769933] create_new_namespaces+0x108/0x360
[32757.793053] unshare_nsproxy_namespaces+0x68/0xb8
[32757.815941] ksys_unshare+0x124/0x3f8
[32757.838533] __arm64_sys_unshare+0x1c/0x38
[32757.860831] invoke_syscall+0x50/0x128
[32757.882673] el0_svc_common.constprop.0+0xc8/0xf0
[32757.904313] do_el0_svc+0x24/0x38
[32757.925917] el0_svc+0x50/0x1e0
[32757.947487] el0t_64_sync_handler+0x100/0x130
[32757.969220] el0t_64_sync+0x1a4/0x1a8
[32757.991241] INFO: task stress-ng-clone:1862471 blocked for more than 124 seconds.
[32758.013463] Tainted: G W O 6.6.0-0006.ctl4.aarch64 #1
[32758.035857] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[32758.058617] task:stress-ng-clone state:D stack:0 pid:1862471 ppid:1705665 flags:0x0000020c
[32758.081778] Call trace:
[32758.104771] __switch_to+0x170/0x238
[32758.127885] __schedule+0x428/0xa08
[32758.150892] schedule+0x58/0x130
[32758.173799] schedule_preempt_disabled+0x18/0x30
[32758.196773] rwsem_down_read_slowpath+0x188/0x670
[32758.219734] down_read+0x38/0xd8
[32758.242653] rdma_dev_init_net+0x120/0x210 [ib_core]
[32758.265798] ops_init+0x80/0x160
[32758.288731] setup_net+0x114/0x338
[32758.311709] copy_net_ns+0x144/0x310
[32758.334641] create_new_namespaces+0x108/0x360
[32758.357750] unshare_nsproxy_namespaces+0x68/0xb8
[32758.380629] ksys_unshare+0x124/0x3f8
[32758.403188] __arm64_sys_unshare+0x1c/0x38
[32758.425459] invoke_syscall+0x50/0x128
[32758.447288] el0_svc_common.constprop.0+0xc8/0xf0
[32758.468920] do_el0_svc+0x24/0x38
[32758.490517] el0_svc+0x50/0x1e0
[32758.512085] el0t_64_sync_handler+0x100/0x130
[32758.533800] el0t_64_sync+0x1a4/0x1a8
[32758.556548] INFO: task stress-ng-clone:1866684 blocked for more than 125 seconds.
[32758.578796] Tainted: G W O 6.6.0-0006.ctl4.aarch64 #1
[32758.601184] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[32758.623945] task:stress-ng-clone state:D stack:0 pid:1866684 ppid:1704388 flags:0x0000020c
[32758.647123] Call trace:
[32758.670159] __switch_to+0x170/0x238
[32758.693295] __schedule+0x428/0xa08
[32758.716341] schedule+0x58/0x130
[32758.739291] schedule_preempt_disabled+0x18/0x30
[32758.762297] rwsem_down_read_slowpath+0x188/0x670
[32758.785305] down_read+0x38/0xd8
[32758.808267] rdma_dev_init_net+0x120/0x210 [ib_core]
[32758.831428] ops_init+0x80/0x160
[32758.854385] setup_net+0x114/0x338
[32758.877386] copy_net_ns+0x144/0x310
[32758.900337] create_new_namespaces+0x108/0x360
[32758.923472] unshare_nsproxy_namespaces+0x68/0xb8
[32758.946378] ksys_unshare+0x124/0x3f8
[32758.968961] __arm64_sys_unshare+0x1c/0x38
[32758.991256] invoke_syscall+0x50/0x128
[32759.013080] el0_svc_common.constprop.0+0xc8/0xf0
[32759.034750] do_el0_svc+0x24/0x38
[32759.056358] el0_svc+0x50/0x1e0
[32759.077935] el0t_64_sync_handler+0x100/0x130
[32759.099678] el0t_64_sync+0x1a4/0x1a8
[32759.121308] Future hung task reports are suppressed, see sysctl kernel.hung_task_warnings
[33047.476663] hrtimer: interrupt took 41202 ns
[33077.887371] sched: DL replenish lagged too much
[33315.344633] sched: RT throttling activated
[33341.279179] watchdog: BUG: soft lockup - CPU#108 stuck for 22s! [stress-ng-cpu-s:396764]
[33341.413642] Modules linked in: binfmt_misc xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables libcrc32c bridge stp llc bonding rfkill sunrpc vfat fat ipmi_si phytium_dc_drm ipmi_devintf drm_display_helper ipmi_msghandler ses cec enclosure drm_kms_helper cppc_cpufreq sg drm i2c_core fuse nfnetlink ext4 jbd2 dm_multipath mpt3sas(O) raid_class scsi_transport_sas mlx5_ib(O) macsec ib_uverbs(O) ib_core(O) sd_mod t10_pi crc64_rocksoft_generic crc64_rocksoft crc64 crct10dif_ce ghash_ce sm4_ce_gcm sm4_ce_ccm sm4_ce sm4_ce_cipher sm4 sm3_ce sha3_ce sha512_ce ahci sha512_arm64 sha2_ce libahci sha256_arm64 sha1_ce sbsa_gwdt megaraid_sas(O) libata mlx5_core(O) dm_mirror dm_region_hash dm_log dm_mod mlxfw(O) psample mlxdevm(O) mlx_compat(O) tls pci_hyperv_intf ngbe(O) aes_neon_bs aes_neon_blk aes_ce_blk aes_ce_cipher
[33342.020187] CPU: 108 PID: 396764 Comm: stress-ng-cpu-s Kdump: loaded Tainted: G W O 6.6.0-0006.ctl4.aarch64 #1
[33342.204035] Hardware name: SuperCloud R2227/FT5000C, BIOS KL4.2A.CY.S.029.240626.R 06/26/2024 16:26:27
[33342.389751] pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[33342.574945] pc : print_cpu+0x2d4/0x6d8
[33342.749605] lr : print_cpu+0x2ec/0x6d8
[33342.909203] sp : ffff80010240bba0
[33343.072975] x29: ffff80010240bba0 x28: 0000000000200000 x27: 0000000000000000
[33343.249328] x26: ffff80008136e630 x25: ffff80008136eaa8 x24: 0000000000000000
[33343.424992] x23: ffff800082045980 x22: ffff800084098540 x21: ffff61071ae86380
[33343.586300] x20: ffff510509cb0000 x19: ffff510509cb0000 x18: ffffffffffffffff
[33343.749402] x17: 2d2d2d2d2d2d2d2d x16: 2d2d2d2d2d2d2d2d x15: ffff80010240b8d0
[33343.920636] x14: 0000000000000000 x13: ffff6107415d0491 x12: 2d2d2d2d2d2d2d2d
[33344.071637] x11: 0000000000000000 x10: 000000000000000a x9 : ffff80010240ba80
[33344.205707] x8 : 000000000000000a x7 : 00000000ffffffd0 x6 : 000000000000000a
[33344.335930] x5 : ffff6107415d0495 x4 : 00000000001d0495 x3 : ffff6101a03acc10
[33344.471238] x2 : 000000000000005e x1 : ffff510509cb0a30 x0 : ffff51060473e890
[33344.596109] Call trace:
[33344.719296] print_cpu+0x2d4/0x6d8
[33344.842759] sched_debug_show+0x28/0x58
[33344.956976] seq_read_iter+0x168/0x478
[33345.062917] seq_read+0xa4/0xe8
[33345.154201] full_proxy_read+0x68/0xc8
[33345.276436] vfs_read+0xb8/0x1f8
[33345.405545] ksys_read+0x7c/0x120
[33345.542316] __arm64_sys_read+0x24/0x38
[33345.705965] invoke_syscall+0x50/0x128
[33345.887140] el0_svc_common.constprop.0+0xc8/0xf0
[33346.061783] do_el0_svc+0x24/0x38
[33346.208799] el0_svc+0x50/0x1e0
[33346.333700] el0t_64_sync_handler+0x100/0x130
[33346.480876] el0t_64_sync+0x1a4/0x1a8
[33346.626028] Kernel panic - not syncing: softlockup: hung tasks
[33346.762219] CPU: 108 PID: 396764 Comm: stress-ng-cpu-s Kdump: loaded Tainted: G W O L 6.6.0-0006.ctl4.aarch64 #1
[33346.909029] Hardware name: SuperCloud R2227/FT5000C, BIOS KL4.2A.CY.S.029.240626.R 06/26/2024 16:26:27
[33347.052070] Call trace:
[33347.222863] dump_backtrace+0xa0/0x128
[33347.373365] show_stack+0x20/0x38
[33347.494054] dump_stack_lvl+0x78/0xc8
[33347.619071] dump_stack+0x18/0x28
[33347.743973] panic+0x35c/0x3f8
[33347.874043] watchdog_timer_fn+0x21c/0x2a8
[33348.014973] __hrtimer_run_queues+0x15c/0x378
[33348.150149] hrtimer_interrupt+0x10c/0x348
[33348.276630] arch_timer_handler_phys+0x34/0x58
[33348.388360] handle_percpu_devid_irq+0x90/0x1c8
[33348.492041] handle_irq_desc+0x48/0x68
[33348.593527] generic_handle_domain_irq+0x24/0x38
[33348.696771] gic_handle_irq+0x1c0/0x380
[33348.791382] call_on_irq_stack+0x24/0x30
[33348.878987] do_interrupt_handler+0x88/0x98
[33348.960444] el1_interrupt+0x54/0x120
[33349.023389] el1h_64_irq_handler+0x24/0x30
[33349.083663] el1h_64_irq+0x78/0x80
[33349.146750] print_cpu+0x2d4/0x6d8
[33349.209473] sched_debug_show+0x28/0x58
[33349.266322] seq_read_iter+0x168/0x478
[33349.328919] seq_read+0xa4/0xe8
[33349.392488] full_proxy_read+0x68/0xc8
[33349.460141] vfs_read+0xb8/0x1f8
[33349.528925] ksys_read+0x7c/0x120
[33349.595094] __arm64_sys_read+0x24/0x38
[33349.685944] invoke_syscall+0x50/0x128
[33349.782633] el0_svc_common.constprop.0+0xc8/0xf0
[33349.900634] do_el0_svc+0x24/0x38
[33350.010436] el0_svc+0x50/0x1e0
[33350.123291] el0t_64_sync_handler+0x100/0x130
[33350.242707] el0t_64_sync+0x1a4/0x1a8
[33350.356508] SMP: stopping secondary CPUs
[33351.100301] Starting crashdump kernel...
[33351.120225] Bye!

Root cause analysis:

This is a classic ABBA deadlock caused by inconsistent lock ordering
between rdma_dev_exit_net() and rdma_dev_init_net():
Thread A (cleanup_net workqueue -> kworker/u256:1):
  rdma_dev_exit_net():
    down_write(&rdma_nets_rwsem)  <- held at rdma_dev_exit_net+0x60
    down_read(&devices_rwsem)     <- waiting (shown in rwsem_down_write_slowpath)

Thread B (stress-ng-clone processes):
  rdma_dev_init_net():
    down_read(&devices_rwsem)     <- held at rdma_dev_init_net+0x120
    down_read(&rdma_nets_rwsem)   <- waiting (blocked by the pending writer from Thread A)

The soft lockup in print_cpu() is a cascading effect: when the scheduler
debug file (sched_debug) is read, print_cpu() iterates over all processes
under rcu_read_lock(). With
thousands of processes stuck in D state due to the RDMA deadlock, this
iteration takes 22+ seconds, exceeding the soft lockup threshold and
triggering a kernel panic.

Solution:

Reorder the lock acquisition in rdma_dev_exit_net() to match
rdma_dev_init_net(). Both functions now take the locks in the same order:

1. down_read(&devices_rwsem)
2. down_write/down_read(&rdma_nets_rwsem)

This prevents the deadlock because both code paths now follow a consistent
lock ordering, which is a fundamental requirement for deadlock-free
execution.
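
For clarity, the acquisition order in the two paths is summarized below.
This is a simplified sketch showing only the order in which the locks are
taken (the work done under the locks and the corresponding releases are
omitted); it is not verbatim kernel source:

  /* rdma_dev_init_net() - namespace creation */
  down_read(&devices_rwsem);      /* lock A: device list */
  down_read(&rdma_nets_rwsem);    /* lock B: rdma nets */

  /* rdma_dev_exit_net() - namespace teardown, before this patch */
  down_write(&rdma_nets_rwsem);   /* lock B first */
  down_read(&devices_rwsem);      /* lock A second */

  /* rdma_dev_exit_net() - namespace teardown, after this patch */
  down_read(&devices_rwsem);      /* lock A first, matching rdma_dev_init_net() */
  down_write(&rdma_nets_rwsem);   /* lock B second */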

Tested with:

  stress-ng --clone 100 --timeout 300s

No hung tasks or soft lockups were observed after the fix.
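
For reference, a minimal userspace reproducer in the spirit of the
stress-ng workload could look like the sketch below: it spawns workers
that repeatedly create fresh network namespaces via unshare(CLONE_NEWNET),
so namespace setup and teardown (cleanup_net) run concurrently. The worker
and iteration counts are arbitrary, CAP_SYS_ADMIN is required, and this is
only an illustrative sketch, not the exact workload used for testing:

  #define _GNU_SOURCE
  #include <sched.h>
  #include <stdio.h>
  #include <sys/wait.h>
  #include <unistd.h>

  int main(void)
  {
          int i, j;

          for (i = 0; i < 100; i++) {     /* ~100 workers, like --clone 100 */
                  pid_t pid = fork();

                  if (pid < 0) {
                          perror("fork");
                          break;
                  }
                  if (pid == 0) {
                          for (j = 0; j < 1000; j++) {
                                  /*
                                   * Each call moves the worker into a new net
                                   * namespace; the previous one is torn down
                                   * asynchronously by the cleanup_net worker.
                                   */
                                  if (unshare(CLONE_NEWNET))
                                          perror("unshare");
                          }
                          _exit(0);
                  }
          }
          while (wait(NULL) > 0)          /* reap all workers */
                  ;
          return 0;
  }
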
Signed-off-by: Qiliang Yuan <yuanql9@...natelecom.cn>
Signed-off-by: wujing <realwujing@...com>
---
drivers/infiniband/core/device.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index d4263385850a..9ef2c966df8c 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -1119,6 +1119,13 @@ static void rdma_dev_exit_net(struct net *net)
unsigned long index;
int ret;
+ /*
+ * Fix ABBA deadlock: acquire locks in same order as rdma_dev_init_net
+ * to prevent deadlock with concurrent namespace operations.
+ * rdma_dev_init_net: devices_rwsem -> rdma_nets_rwsem
+ * rdma_dev_exit_net: devices_rwsem -> rdma_nets_rwsem (was reversed)
+ */
+ down_read(&devices_rwsem);
down_write(&rdma_nets_rwsem);
/*
* Prevent the ID from being re-used and hide the id from xa_for_each.
@@ -1126,8 +1133,6 @@ static void rdma_dev_exit_net(struct net *net)
ret = xa_err(xa_store(&rdma_nets, rnet->id, NULL, GFP_KERNEL));
WARN_ON(ret);
up_write(&rdma_nets_rwsem);
-
- down_read(&devices_rwsem);
xa_for_each (&devices, index, dev) {
get_device(&dev->dev);
/*
--
2.43.0