Message-ID: <tencent_A0378C3D24B309598636129E2E26722E1106@qq.com>
Date: Wed, 5 Mar 2025 09:11:58 -0500
From: "ffhgfv" <744439878@...com>
To: "netdev" <netdev@...r.kernel.org>, "jiri" <jiri@...nulli.us>, "davem" <davem@...emloft.net>, "edumazet" <edumazet@...gle.com>, "kuba" <kuba@...nel.org>, "pabeni" <pabeni@...hat.com>, "horms" <horms@...nel.org>, "linux-kernel" <linux-kernel@...r.kernel.org>
Subject: deadlock in devlink_compat_running_version  and suggestions for fixing it

Hello, I found a bug titled "INFO: task hung in devlink_compat_running_version" with a modified syzkaller on the latest upstream kernel. It is related to the devlink subsystem.
If you fix this issue, please add the following tag to the commit:  Reported-by: Jianzhou Zhao <xnxc22xnxc22@...com>,    xingwei lee <xrivendell7@...il.com>, Zhizhuo Tang <strforexctzzchange@...mail.com>

------------[ cut here ]------------
TITLE:  INFO: task hung in devlink_compat_running_version 
==================================================================
INFO: task systemd-udevd:15007 blocked for more than 143 seconds.
      Not tainted 6.14.0-rc5-dirty #2
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:systemd-udevd   state:D stack:20128 pid:15007 tgid:15007 ppid:5221   task_flags:0x400140 flags:0x00004000
Call Trace:
 <task>
 context_switch kernel/sched/core.c:5378 [inline]
 __schedule+0xf26/0x57d0 kernel/sched/core.c:6765
 __schedule_loop kernel/sched/core.c:6842 [inline]
 schedule+0xe7/0x350 kernel/sched/core.c:6857
 schedule_preempt_disabled+0x13/0x30 kernel/sched/core.c:6914
 __mutex_lock_common kernel/locking/mutex.c:662 [inline]
 __mutex_lock+0x631/0xb00 kernel/locking/mutex.c:730
 devlink_compat_running_version+0xd5/0x7f0 net/devlink/dev.c:1224
 dev_ethtool+0x27a/0x330 net/ethtool/ioctl.c:3411
 dev_ioctl+0x2d4/0x10c0 net/core/dev_ioctl.c:759
 sock_do_ioctl+0x1ca/0x260 net/socket.c:1213
 sock_ioctl+0x23a/0x6c0 net/socket.c:1318
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:906 [inline]
 __se_sys_ioctl fs/ioctl.c:892 [inline]
 __x64_sys_ioctl+0x1a4/0x210 fs/ioctl.c:892
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xcb/0x250 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f33c87aa237
RSP: 002b:00007ffd0cbbafd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00005559e7171c10 RCX: 00007f33c87aa237
RDX: 00007ffd0cbbb0a0 RSI: 0000000000008946 RDI: 0000000000000007
RBP: 00007ffd0cbbb0d0 R08: 00005559e7196eb0 R09: 0000000000000000
R10: 00007f33c85ec6c0 R11: 0000000000000246 R12: 00005559e7196eb0
R13: 00005559e719e090 R14: 00007ffd0cbbb0a0 R15: 0000000000000007
 </task>

Showing all locks held in the system:
3 locks held by systemd/1:
3 locks held by kworker/u10:0/29:
 #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1327/0x1c30 kernel/workqueue.c:3221
 #1: ffffc9000050fd20 ((linkwatch_work).work){+.+.}-{0:0}, at: process_one_work+0x8f8/0x1c30 kernel/workqueue.c:3222
 #2: ffffffff8fcef528 (rtnl_mutex){+.+.}-{4:4}, at: linkwatch_event+0xf/0x70 net/core/link_watch.c:285
1 lock held by khungtaskd/35:
 #0: ffffffff8dfbc1a0 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline]
 #0: ffffffff8dfbc1a0 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:849 [inline]
 #0: ffffffff8dfbc1a0 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x7f/0x390 kernel/locking/lockdep.c:6746
2 locks held by kswapd0/98:
5 locks held by kworker/u10:3/256:
1 lock held by systemd-journal/5208:
 #0: ffff888046813df0 (&vma->vm_lock->lock){++++}-{4:4}, at: vma_start_read include/linux/mm.h:717 [inline]
 #0: ffff888046813df0 (&vma->vm_lock->lock){++++}-{4:4}, at: lock_vma_under_rcu+0x141/0x9a0 mm/memory.c:6378
1 lock held by systemd-udevd/5221:
 #0: ffff888045dd72e8 (mapping.invalidate_lock){++++}-{4:4}, at: filemap_invalidate_lock_shared include/linux/fs.h:932 [inline]
 #0: ffff888045dd72e8 (mapping.invalidate_lock){++++}-{4:4}, at: page_cache_ra_unbounded+0x173/0x790 mm/readahead.c:229
1 lock held by cron/8674:
 #0: ffff88804cdf1658 (&vma->vm_lock->lock){++++}-{4:4}, at: vma_start_read include/linux/mm.h:717 [inline]
 #0: ffff88804cdf1658 (&vma->vm_lock->lock){++++}-{4:4}, at: lock_vma_under_rcu+0x141/0x9a0 mm/memory.c:6378
3 locks held by sshd/9404:
1 lock held by syz-executor/9410:
 #0: ffff8880258d4730 (&vma->vm_lock->lock){++++}-{4:4}, at: vma_start_read include/linux/mm.h:717 [inline]
 #0: ffff8880258d4730 (&vma->vm_lock->lock){++++}-{4:4}, at: lock_vma_under_rcu+0x141/0x9a0 mm/memory.c:6378
3 locks held by kworker/u8:3/9993:
 #0: ffff88804a98f148 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: process_one_work+0x1327/0x1c30 kernel/workqueue.c:3221
 #1: ffffc90006dcfd20 ((work_completion)(&(&net->ipv6.addr_chk_work)->work)){+.+.}-{0:0}, at: process_one_work+0x8f8/0x1c30 kernel/workqueue.c:3222
 #2: ffffffff8fcef528 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_net_lock include/linux/rtnetlink.h:129 [inline]
 #2: ffffffff8fcef528 (rtnl_mutex){+.+.}-{4:4}, at: addrconf_verify_work+0x12/0x30 net/ipv6/addrconf.c:4730
2 locks held by kworker/u8:4/10878:
4 locks held by kworker/u8:5/12145:
 #0: ffff88801beeb948 ((wq_completion)netns){+.+.}-{0:0}, at: process_one_work+0x1327/0x1c30 kernel/workqueue.c:3221
 #1: ffffc90003217d20 (net_cleanup_work){+.+.}-{0:0}, at: process_one_work+0x8f8/0x1c30 kernel/workqueue.c:3222
 #2: ffffffff8fcd9450 (pernet_ops_rwsem){++++}-{4:4}, at: cleanup_net+0xca/0xb90 net/core/net_namespace.c:606
 #3: ffffffff8fcef528 (rtnl_mutex){+.+.}-{4:4}, at: wg_destruct+0x29/0x3d0 drivers/net/wireguard/device.c:246
1 lock held by systemd-udevd/15007:
 #0: ffff88805c834250 (&devlink->lock_key#7){+.+.}-{4:4}, at: devlink_compat_running_version+0xd5/0x7f0 net/devlink/dev.c:1224
7 locks held by syz-executor/15828:
 #0: ffff88802657c420 (sb_writers#5){.+.+}-{0:0}, at: ksys_write+0x122/0x240 fs/read_write.c:731
 #1: ffff88806c476088 (&of->mutex#2){+.+.}-{4:4}, at: kernfs_fop_write_iter+0x27a/0x500 fs/kernfs/file.c:325
 #2: ffff88801fa39d28 (kn->active#63){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x29e/0x500 fs/kernfs/file.c:326
 #3: ffffffff8f29aec8 (nsim_bus_dev_list_lock){+.+.}-{4:4}, at: del_device_store+0xc9/0x4b0 drivers/net/netdevsim/bus.c:216
 #4: ffff88805c8350e8 (&dev->mutex){....}-{4:4}, at: device_lock include/linux/device.h:1030 [inline]
 #4: ffff88805c8350e8 (&dev->mutex){....}-{4:4}, at: __device_driver_lock drivers/base/dd.c:1095 [inline]
 #4: ffff88805c8350e8 (&dev->mutex){....}-{4:4}, at: device_release_driver_internal+0xa4/0x620 drivers/base/dd.c:1293
 #5: ffff88805c834250 (&devlink->lock_key#7){+.+.}-{4:4}, at: nsim_drv_remove+0x4a/0x1d0 drivers/net/netdevsim/dev.c:1675
 #6: ffffffff8fcef528 (rtnl_mutex){+.+.}-{4:4}, at: unregister_nexthop_notifier+0x19/0x70 net/ipv4/nexthop.c:3906
2 locks held by ifquery/18079:
 #0: ffff8880766a66c8 (nlk_cb_mutex-ROUTE){+.+.}-{4:4}, at: __netlink_dump_start+0x156/0x980 net/netlink/af_netlink.c:2387
 #1: ffffffff8fcef528 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_lock net/core/rtnetlink.c:79 [inline]
 #1: ffffffff8fcef528 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_dumpit+0x199/0x200 net/core/rtnetlink.c:6780
2 locks held by syz-executor/18644:
 #0: ffffffff8fcef528 (rtnl_mutex){+.+.}-{4:4}, at: tun_detach drivers/net/tun.c:698 [inline]
 #0: ffffffff8fcef528 (rtnl_mutex){+.+.}-{4:4}, at: tun_chr_close+0x38/0x230 drivers/net/tun.c:3517
 #1: ffffffff8dfc7638 (rcu_state.exp_mutex){+.+.}-{4:4}, at: exp_funnel_lock+0x1a4/0x3a0 kernel/rcu/tree_exp.h:334
3 locks held by syz-executor/18673:
 #0: ffff8880273ecdf0 (&hdev->req_lock){+.+.}-{4:4}, at: hci_dev_do_close+0x29/0xa0 net/bluetooth/hci_core.c:480
 #1: ffff8880273ec078 (&hdev->lock){+.+.}-{4:4}, at: hci_dev_close_sync+0x35e/0x11a0 net/bluetooth/hci_sync.c:5185
 #2: ffff88805c5da350 (&conn->lock#2){+.+.}-{4:4}, at: l2cap_conn_del+0x80/0x750 net/bluetooth/l2cap_core.c:1761
3 locks held by syz-executor/18700:
 #0: ffff88807639cdf0 (&hdev->req_lock){+.+.}-{4:4}, at: hci_dev_do_close+0x29/0xa0 net/bluetooth/hci_core.c:480
 #1: ffff88807639c078 (&hdev->lock){+.+.}-{4:4}, at: hci_dev_close_sync+0x35e/0x11a0 net/bluetooth/hci_sync.c:5185
 #2: ffff88807aecb350 (&conn->lock#2){+.+.}-{4:4}, at: l2cap_conn_del+0x80/0x750 net/bluetooth/l2cap_core.c:1761
2 locks held by syz-executor/19212:
 #0: ffffffff8fcd9450 (pernet_ops_rwsem){++++}-{4:4}, at: copy_net_ns+0x28a/0x600 net/core/net_namespace.c:512
 #1: ffffffff8fcef528 (rtnl_mutex){+.+.}-{4:4}, at: cangw_pernet_exit_batch+0x15/0xa0 net/can/gw.c:1257
2 locks held by syz-executor/19515:
 #0: ffffffff8fcd9450 (pernet_ops_rwsem){++++}-{4:4}, at: copy_net_ns+0x28a/0x600 net/core/net_namespace.c:512
 #1: ffffffff8fcef528 (rtnl_mutex){+.+.}-{4:4}, at: ip_tunnel_init_net+0x20f/0x780 net/ipv4/ip_tunnel.c:1159
2 locks held by syz-executor/19552:
 #0: ffffffff8fcd9450 (pernet_ops_rwsem){++++}-{4:4}, at: copy_net_ns+0x28a/0x600 net/core/net_namespace.c:512
 #1: ffffffff8fcef528 (rtnl_mutex){+.+.}-{4:4}, at: ip_tunnel_init_net+0x20f/0x780 net/ipv4/ip_tunnel.c:1159
2 locks held by ifquery/19620:
 #0: ffff888076d836c8 (nlk_cb_mutex-ROUTE){+.+.}-{4:4}, at: netlink_dump+0x663/0xcf0 net/netlink/af_netlink.c:2254
 #1: ffffffff8fcef528 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_lock net/core/rtnetlink.c:79 [inline]
 #1: ffffffff8fcef528 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_dumpit+0x199/0x200 net/core/rtnetlink.c:6780
2 locks held by syz-executor/19627:
 #0: ffff88807d654df0 (&hdev->req_lock){+.+.}-{4:4}, at: hci_dev_do_close+0x29/0xa0 net/bluetooth/hci_core.c:480
 #1: ffff88807d654078 (&hdev->lock){+.+.}-{4:4}, at: hci_dev_close_sync+0x35e/0x11a0 net/bluetooth/hci_sync.c:5185
1 lock held by syz-executor/19746:
 #0: ffffffff8fcef528 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_net_lock include/linux/rtnetlink.h:129 [inline]
 #0: ffffffff8fcef528 (rtnl_mutex){+.+.}-{4:4}, at: inet_rtm_newaddr+0x30f/0x1570 net/ipv4/devinet.c:987
1 lock held by syz-executor/19755:
 #0: ffffffff8fcef528 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_net_lock include/linux/rtnetlink.h:129 [inline]
 #0: ffffffff8fcef528 (rtnl_mutex){+.+.}-{4:4}, at: inet_rtm_newaddr+0x30f/0x1570 net/ipv4/devinet.c:987
1 lock held by syz-executor/19761:
 #0: ffffffff8fcef528 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_net_lock include/linux/rtnetlink.h:129 [inline]
 #0: ffffffff8fcef528 (rtnl_mutex){+.+.}-{4:4}, at: inet_rtm_newaddr+0x30f/0x1570 net/ipv4/devinet.c:987
2 locks held by syz-executor/19787:
4 locks held by syz-executor/19789:
1 lock held by syz-executor/19843:
1 lock held by syz-executor/19845:
 #0: ffff88804d04e4a8 (&vma->vm_lock->lock){++++}-{4:4}, at: vma_start_read include/linux/mm.h:717 [inline]
 #0: ffff88804d04e4a8 (&vma->vm_lock->lock){++++}-{4:4}, at: lock_vma_under_rcu+0x141/0x9a0 mm/memory.c:6378
1 lock held by syz-executor/19847:
 #0: ffff8880281d2730 (&vma->vm_lock->lock){++++}-{4:4}, at: vma_start_read include/linux/mm.h:717 [inline]
 #0: ffff8880281d2730 (&vma->vm_lock->lock){++++}-{4:4}, at: lock_vma_under_rcu+0x141/0x9a0 mm/memory.c:6378
1 lock held by syz-executor/19850:
 #0: ffff88804d34bb68 (&vma->vm_lock->lock){++++}-{4:4}, at: vma_start_read include/linux/mm.h:717 [inline]
 #0: ffff88804d34bb68 (&vma->vm_lock->lock){++++}-{4:4}, at: lock_vma_under_rcu+0x141/0x9a0 mm/memory.c:6378

=============================================

NMI backtrace for cpu 1
CPU: 1 UID: 0 PID: 35 Comm: khungtaskd Not tainted 6.14.0-rc5-dirty #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Call Trace:
 <task>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x116/0x1b0 lib/dump_stack.c:120
 nmi_cpu_backtrace+0x2a0/0x350 lib/nmi_backtrace.c:113
 nmi_trigger_cpumask_backtrace+0x29c/0x300 lib/nmi_backtrace.c:62
 trigger_all_cpu_backtrace include/linux/nmi.h:162 [inline]
 check_hung_uninterruptible_tasks kernel/hung_task.c:236 [inline]
 watchdog+0xea3/0x1200 kernel/hung_task.c:399
 kthread+0x3b0/0x760 kernel/kthread.c:464
 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:148
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
 </task>
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 UID: 0 PID: 19787 Comm: syz-executor Not tainted 6.14.0-rc5-dirty #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
RIP: 0010:memory_is_poisoned_n mm/kasan/generic.c:130 [inline]
RIP: 0010:memory_is_poisoned mm/kasan/generic.c:161 [inline]
RIP: 0010:check_region_inline mm/kasan/generic.c:180 [inline]
RIP: 0010:kasan_check_range+0x5a/0x1a0 mm/kasan/generic.c:189
Code: b8 00 00 00 4c 8d 54 37 ff 48 89 fd 48 b8 00 00 00 00 00 fc ff df 4d 89 d1 48 c1 ed 03 49 c1 e9 03 48 01 c5 49 01 c1 48 89 e8 <49> 8d 59 01 48 89 da 48 29 ea 48 83 fa 10 0f 8e 92 00 00 00 41 89
RSP: 0018:ffffc90002a77458 EFLAGS: 00000086
RAX: fffffbfff2d943a0 RBX: 0000000000000019 RCX: ffffffff81947d6e
RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffffffff96ca1d00
RBP: fffffbfff2d943a0 R08: 0000000000000000 R09: fffffbfff2d943a0
R10: ffffffff96ca1d07 R11: 0000000000000002 R12: 0000000000000000
R13: ffff88802b523108 R14: 0000000000000019 R15: ffff88802b522500
FS:  0000555589ac2500(0000) GS:ffff88802b800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005559aafa0ff0 CR3: 000000005ae6c000 CR4: 0000000000752ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 80000000
Call Trace:
 <nmi>
 </nmi>
 <task>
 instrument_atomic_read include/linux/instrumented.h:68 [inline]
 _test_bit include/asm-generic/bitops/instrumented-non-atomic.h:141 [inline]
 hlock_class+0x4e/0x130 kernel/locking/lockdep.c:230
 check_wait_context kernel/locking/lockdep.c:4853 [inline]
 __lock_acquire+0x451/0x3c80 kernel/locking/lockdep.c:5178
 lock_acquire.part.0+0x11b/0x370 kernel/locking/lockdep.c:5851
 rcu_lock_acquire include/linux/rcupdate.h:337 [inline]
 rcu_read_lock_sched include/linux/rcupdate.h:941 [inline]
 pfn_valid include/linux/mmzone.h:2067 [inline]
 pfn_valid include/linux/mmzone.h:2050 [inline]
 page_table_check_set+0x113/0x9f0 mm/page_table_check.c:110
 __page_table_check_ptes_set+0x28e/0x450 mm/page_table_check.c:225
 page_table_check_ptes_set include/linux/page_table_check.h:74 [inline]
 set_ptes include/linux/pgtable.h:288 [inline]
 __copy_present_ptes mm/memory.c:968 [inline]
 copy_present_ptes mm/memory.c:1051 [inline]
 copy_pte_range mm/memory.c:1174 [inline]
 copy_pmd_range mm/memory.c:1262 [inline]
 copy_pud_range mm/memory.c:1299 [inline]
 copy_p4d_range mm/memory.c:1323 [inline]
 copy_page_range+0x3048/0x4e30 mm/memory.c:1421
 dup_mmap kernel/fork.c:748 [inline]
 dup_mm kernel/fork.c:1700 [inline]
 copy_mm kernel/fork.c:1752 [inline]
 copy_process+0x7dea/0x8ab0 kernel/fork.c:2403
 kernel_clone+0xeb/0x920 kernel/fork.c:2815
 __do_sys_clone+0xcf/0x120 kernel/fork.c:2958
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xcb/0x250 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f5b34d9fec7
Code: 00 00 90 f3 0f 1e fa 64 48 8b 04 25 10 00 00 00 45 31 c0 31 d2 31 f6 bf 11 00 20 01 4c 8d 90 d0 02 00 00 b8 38 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 41 41 89 c0 85 c0 75 2c 64 48 8b 04 25 10 00
RSP: 002b:00007ffe3ca176c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
RAX: ffffffffffffffda RBX: 00007f5b35afd660 RCX: 00007f5b34d9fec7
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000080000000
R10: 0000555589ac27d0 R11: 0000000000000246 R12: 0000000000000001
R13: 0000000000000003 R14: 00007f5b34e4e881 R15: 0000000000000002
 </task>

==================================================================
I used the same kernel as the syzbot upstream instance: 7eb172143d5508b4da468ed59ee857c6e5e01da6
kernel config: https://syzkaller.appspot.com/text?tag=KernelConfig&x=da4b04ae798b7ef6
compiler: gcc version 11.4.0
===============================================================================
Unfortunately, the modified syzkaller did not generate a working reproducer.
Below is my analysis of the bug and a suggested fix, in the hope that it helps with the repair:
## Root cause analysis
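The problem is in devlink_compat_running_version(), which takes the devlink instance lock with devl_lock() and then calls into a path that needs the RTNL lock (rtnl_mutex).
Other code in the system takes the two locks in the reverse order (RTNL first, then the devlink lock), which results in a lock inversion and a deadlock. In the lock dump above, systemd-udevd/15007 is waiting for devlink lock ffff88805c834250, while syz-executor/15828 is listed with that same lock from nsim_drv_remove() and with rtnl_mutex from unregister_nexthop_notifier(). The two orderings look like this: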
devlink_compat_running_version()
    devl_lock(devlink);              // Holds the devlink lock
    ↓
    __devlink_compat_running_version()
        → Implicitly calls an operation that requires rtnl_lock() // Waits for the RTNL lock

linkwatch_event()
    rtnl_lock();                     // Holds the RTNL lock
    → Invokes a code path involving devlink
        devl_lock(devlink);          // Waits for the devlink lock
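
To make the inversion concrete, here is a minimal userspace sketch in C of the kind of pairwise order check that lockdep performs. It is only an illustration under stated assumptions: the names lock_class, order_graph, and record_order are hypothetical and do not exist in the kernel, and the check only catches a direct two-lock inversion, not longer cycles.

```c
#include <stdbool.h>

/* Hypothetical lock classes, used only for this illustration. */
enum lock_class { LOCK_RTNL, LOCK_DEVLINK, LOCK_CLASS_COUNT };

/* seen[a][b] is true once some path acquired b while holding a. */
struct order_graph {
    bool seen[LOCK_CLASS_COUNT][LOCK_CLASS_COUNT];
};

/*
 * Record that a path holding `held` wants to acquire `wanted`.
 * Returns false if the reverse order was recorded earlier, i.e. the
 * two paths together form an AB/BA inversion.
 */
static bool record_order(struct order_graph *g,
                         enum lock_class held,
                         enum lock_class wanted)
{
    if (g->seen[wanted][held])
        return false;
    g->seen[held][wanted] = true;
    return true;
}
```

Recording devlink → RTNL first and then RTNL → devlink makes the second call return false, which corresponds to the two orderings shown above.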

### Repair suggestions
Adjust the lock acquisition order so that every code path that takes both the devlink lock and the RTNL lock takes them in the order RTNL lock → devlink lock.
Patch example:
void devlink_compat_running_version(struct devlink *devlink, char *buf, size_t len)
{
    if (!devlink->ops->info_get)
        return;

+   rtnl_lock();  // Take the RTNL lock first
    devl_lock(devlink);
    if (devl_is_registered(devlink))
        __devlink_compat_running_version(devlink, buf, len); // The callee must not take the RTNL lock itself
    devl_unlock(devlink);
+   rtnl_unlock();
}
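
The comment on the callee matters because rtnl_mutex is not recursive. The following userspace sketch models that caveat. It is not kernel code: model_mutex, model_trylock, model_unlock, and run_with_rtnl_then_devlink are hypothetical names, and a failed trylock stands in for what would be a hang with a real mutex.

```c
#include <stdbool.h>

/* Hypothetical userspace model of a non-recursive mutex. */
struct model_mutex {
    bool held;
};

/* Returns true on success, false where a real mutex would block. */
static bool model_trylock(struct model_mutex *m)
{
    if (m->held)
        return false;
    m->held = true;
    return true;
}

static void model_unlock(struct model_mutex *m)
{
    m->held = false;
}

/*
 * Same shape as the patched function: the outer lock (stand-in for
 * RTNL) is taken first, the inner lock (stand-in for the devlink
 * instance lock) second. callee_takes_rtnl simulates a callee that
 * tries to take RTNL again. Returns false if any acquisition would
 * have blocked.
 */
static bool run_with_rtnl_then_devlink(struct model_mutex *rtnl,
                                       struct model_mutex *devl,
                                       bool callee_takes_rtnl)
{
    bool ok = true;

    if (!model_trylock(rtnl))
        return false;
    if (!model_trylock(devl)) {
        model_unlock(rtnl);
        return false;
    }

    if (callee_takes_rtnl)
        ok = model_trylock(rtnl);   /* fails: rtnl is already held */

    model_unlock(devl);
    model_unlock(rtnl);
    return ok;
}
```

With callee_takes_rtnl set to false the call succeeds. With it set to true the nested acquisition fails, so the suggested patch is only safe if nothing under __devlink_compat_running_version() takes the RTNL lock on its own.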

=========================================================================
I hope it helps.
Best regards
Jianzhou Zhao
xingwei lee
Zhizhuo Tang
