Message-ID:
<LV2P220MB08459B430FFD8830782201B4D2BFA@LV2P220MB0845.NAMP220.PROD.OUTLOOK.COM>
Date: Sat, 25 Nov 2023 21:55:00 +0800
From: Ian Chen <free122448@...mail.com>
To: netdev@...r.kernel.org
Cc: Heiner Kallweit <hkallweit1@...il.com>
Subject: [BUG] r8169: deadlock when NetworkManager brings link up
Hello,
My home server runs Arch Linux with its stock kernel on a GIGABYTE Z790
AORUS ELITE AX, using the board's built-in RTL8125B Ethernet adapter.
After upgrading from 6.6.1.arch1 to 6.6.2.arch1, the system boots into a
state where every operation on any netlink socket blocks forever, which
makes it effectively unusable. Here is the relevant dmesg output:
kernel: INFO: task kworker/u64:2:218 blocked for more than 122 seconds.
kernel: Not tainted 6.6.2-arch1-1 #1
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
kernel: task:kworker/u64:2 state:D stack:0 pid:218 ppid:2
flags:0x00004000
kernel: Workqueue: events_power_efficient crda_timeout_work [cfg80211]
kernel: Call Trace:
kernel: <TASK>
kernel: __schedule+0x3e8/0x1410
kernel: schedule+0x5e/0xd0
kernel: schedule_preempt_disabled+0x15/0x30
kernel: __mutex_lock.constprop.0+0x39a/0x6a0
kernel: crda_timeout_work+0x10/0x40 [cfg80211
d1ff02bd631e7b94dc4a8630ea4cdb5aede1cb9b]
kernel: process_one_work+0x171/0x340
kernel: worker_thread+0x27b/0x3a0
kernel: ? __pfx_worker_thread+0x10/0x10
kernel: kthread+0xe5/0x120
kernel: ? __pfx_kthread+0x10/0x10
kernel: ret_from_fork+0x31/0x50
kernel: ? __pfx_kthread+0x10/0x10
kernel: ret_from_fork_asm+0x1b/0x30
kernel: </TASK>
kernel: INFO: task kworker/5:1:250 blocked for more than 122 seconds.
kernel: Not tainted 6.6.2-arch1-1 #1
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
kernel: task:kworker/5:1 state:D stack:0 pid:250 ppid:2
flags:0x00004000
kernel: Workqueue: events linkwatch_event
kernel: Call Trace:
kernel: <TASK>
kernel: __schedule+0x3e8/0x1410
kernel: ? sched_clock+0x10/0x30
kernel: schedule+0x5e/0xd0
kernel: schedule_preempt_disabled+0x15/0x30
kernel: __mutex_lock.constprop.0+0x39a/0x6a0
kernel: linkwatch_event+0x12/0x40
kernel: process_one_work+0x171/0x340
kernel: worker_thread+0x27b/0x3a0
kernel: ? __pfx_worker_thread+0x10/0x10
kernel: kthread+0xe5/0x120
kernel: ? __pfx_kthread+0x10/0x10
kernel: ret_from_fork+0x31/0x50
kernel: ? __pfx_kthread+0x10/0x10
kernel: ret_from_fork_asm+0x1b/0x30
kernel: </TASK>
kernel: INFO: task kworker/u64:6:290 blocked for more than 122 seconds.
kernel: Not tainted 6.6.2-arch1-1 #1
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
kernel: task:kworker/u64:6 state:D stack:0 pid:290 ppid:2
flags:0x00004000
kernel: Workqueue: netns cleanup_net
kernel: Call Trace:
kernel: <TASK>
kernel: __schedule+0x3e8/0x1410
kernel: schedule+0x5e/0xd0
kernel: schedule_preempt_disabled+0x15/0x30
kernel: __mutex_lock.constprop.0+0x39a/0x6a0
kernel: wg_netns_pre_exit+0x19/0x100 [wireguard
0c090e6018e49e49957d27fd2202b1db304881dc]
kernel: cleanup_net+0x1e0/0x3b0
kernel: process_one_work+0x171/0x340
kernel: worker_thread+0x27b/0x3a0
kernel: ? __pfx_worker_thread+0x10/0x10
kernel: kthread+0xe5/0x120
kernel: ? __pfx_kthread+0x10/0x10
kernel: ret_from_fork+0x31/0x50
kernel: ? __pfx_kthread+0x10/0x10
kernel: ret_from_fork_asm+0x1b/0x30
kernel: </TASK>
kernel: INFO: task kworker/u64:19:577 blocked for more than 122
seconds.
kernel: Not tainted 6.6.2-arch1-1 #1
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
kernel: task:kworker/u64:19 state:D stack:0 pid:577 ppid:2
flags:0x00004000
kernel: Workqueue: events_power_efficient reg_check_chans_work
[cfg80211]
kernel: Call Trace:
kernel: <TASK>
kernel: __schedule+0x3e8/0x1410
kernel: ? _get_random_bytes+0xc0/0x1a0
kernel: schedule+0x5e/0xd0
kernel: schedule_preempt_disabled+0x15/0x30
kernel: __mutex_lock.constprop.0+0x39a/0x6a0
kernel: ? finish_task_switch.isra.0+0x94/0x2f0
kernel: reg_check_chans_work+0x31/0x5b0 [cfg80211
d1ff02bd631e7b94dc4a8630ea4cdb5aede1cb9b]
kernel: process_one_work+0x171/0x340
kernel: worker_thread+0x27b/0x3a0
kernel: ? __pfx_worker_thread+0x10/0x10
kernel: kthread+0xe5/0x120
kernel: ? __pfx_kthread+0x10/0x10
kernel: ret_from_fork+0x31/0x50
kernel: ? __pfx_kthread+0x10/0x10
kernel: ret_from_fork_asm+0x1b/0x30
kernel: </TASK>
kernel: INFO: task kworker/u64:23:581 blocked for more than 122
seconds.
kernel: Not tainted 6.6.2-arch1-1 #1
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
kernel: task:kworker/u64:23 state:D stack:0 pid:581 ppid:2
flags:0x00004000
kernel: Workqueue: events_power_efficient phy_state_machine [libphy]
kernel: Call Trace:
kernel: <TASK>
kernel: __schedule+0x3e8/0x1410
kernel: schedule+0x5e/0xd0
kernel: schedule_preempt_disabled+0x15/0x30
kernel: __mutex_lock.constprop.0+0x39a/0x6a0
kernel: phy_state_machine+0x47/0x2c0 [libphy
93248cd1d88abf54f1b4cc64a990177f549a7710]
kernel: process_one_work+0x171/0x340
kernel: worker_thread+0x27b/0x3a0
kernel: ? __pfx_worker_thread+0x10/0x10
kernel: kthread+0xe5/0x120
kernel: ? __pfx_kthread+0x10/0x10
kernel: ret_from_fork+0x31/0x50
kernel: ? __pfx_kthread+0x10/0x10
kernel: ret_from_fork_asm+0x1b/0x30
kernel: </TASK>
kernel: INFO: task NetworkManager:849 blocked for more than 122
seconds.
kernel: Not tainted 6.6.2-arch1-1 #1
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
kernel: task:NetworkManager state:D stack:0 pid:849 ppid:1
flags:0x00004002
kernel: Call Trace:
kernel: <TASK>
kernel: __schedule+0x3e8/0x1410
kernel: ? sysvec_apic_timer_interrupt+0xe/0x90
kernel: schedule+0x5e/0xd0
kernel: schedule_preempt_disabled+0x15/0x30
kernel: __mutex_lock.constprop.0+0x39a/0x6a0
kernel: ? pci_conf1_write+0xae/0xf0
kernel: ? pcie_set_readrq+0x8e/0x160
kernel: phy_start_aneg+0x1d/0x40 [libphy
93248cd1d88abf54f1b4cc64a990177f549a7710]
kernel: rtl_reset_work+0x1bd/0x3b0 [r8169
08653ab60f23923c3943d53f140b2b697e265b93]
kernel: r8169_phylink_handler+0x5b/0x240 [r8169
08653ab60f23923c3943d53f140b2b697e265b93]
kernel: phy_link_change+0x2e/0x60 [libphy
93248cd1d88abf54f1b4cc64a990177f549a7710]
kernel: phy_check_link_status+0xad/0xe0 [libphy
93248cd1d88abf54f1b4cc64a990177f549a7710]
kernel: phy_start_aneg+0x25/0x40 [libphy
93248cd1d88abf54f1b4cc64a990177f549a7710]
kernel: rtl8169_change_mtu+0x24/0x60 [r8169
08653ab60f23923c3943d53f140b2b697e265b93]
kernel: dev_set_mtu_ext+0xf1/0x200
kernel: ? select_task_rq_fair+0x82c/0x1dd0
kernel: do_setlink+0x291/0x12d0
kernel: ? remove_entity_load_avg+0x31/0x80
kernel: ? sched_clock+0x10/0x30
kernel: ? sched_clock_cpu+0xf/0x190
kernel: ? __smp_call_single_queue+0xad/0x120
kernel: ? ttwu_queue_wakelist+0xef/0x110
kernel: ? __nla_validate_parse+0x61/0xd10
kernel: ? try_to_wake_up+0x2b7/0x640
kernel: __rtnl_newlink+0x651/0xa10
kernel: ? __kmem_cache_alloc_node+0x1a6/0x340
kernel: ? rtnl_newlink+0x2e/0x70
kernel: rtnl_newlink+0x47/0x70
kernel: rtnetlink_rcv_msg+0x14f/0x3c0
kernel: ? number+0x33b/0x3d0
kernel: ? __pfx_rtnetlink_rcv_msg+0x10/0x10
kernel: netlink_rcv_skb+0x58/0x110
kernel: netlink_unicast+0x1a3/0x290
kernel: netlink_sendmsg+0x254/0x4d0
kernel: ____sys_sendmsg+0x396/0x3d0
kernel: ? copy_msghdr_from_user+0x7d/0xc0
kernel: ___sys_sendmsg+0x9a/0xe0
kernel: __sys_sendmsg+0x7a/0xd0
kernel: do_syscall_64+0x5d/0x90
kernel: ? do_syscall_64+0x6c/0x90
kernel: ? do_syscall_64+0x6c/0x90
kernel: ? do_syscall_64+0x6c/0x90
kernel: entry_SYSCALL_64_after_hwframe+0x6e/0xd8
kernel: RIP: 0033:0x7fc9232e7b3d
kernel: RSP: 002b:00007fffd4df2830 EFLAGS: 00000293 ORIG_RAX:
000000000000002e
kernel: RAX: ffffffffffffffda RBX: 0000000000000055 RCX:
00007fc9232e7b3d
kernel: RDX: 0000000000000000 RSI: 00007fffd4df2870 RDI:
000000000000000d
kernel: RBP: 00007fffd4df2c40 R08: 0000000000000000 R09:
0000000000000000
kernel: R10: 0000000000000000 R11: 0000000000000293 R12:
0000563fe71367c0
kernel: R13: 0000000000000001 R14: 0000000000000000 R15:
0000000000000000
kernel: </TASK>
kernel: INFO: task geoclue:1358 blocked for more than 122 seconds.
kernel: Not tainted 6.6.2-arch1-1 #1
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
kernel: task:geoclue state:D stack:0 pid:1358 ppid:1
flags:0x00000002
kernel: Call Trace:
kernel: <TASK>
kernel: __schedule+0x3e8/0x1410
kernel: schedule+0x5e/0xd0
kernel: schedule_preempt_disabled+0x15/0x30
kernel: __mutex_lock.constprop.0+0x39a/0x6a0
kernel: __netlink_dump_start+0x75/0x290
kernel: ? __pfx_rtnl_dump_all+0x10/0x10
kernel: rtnetlink_rcv_msg+0x277/0x3c0
kernel: ? __pfx_rtnl_dump_all+0x10/0x10
kernel: ? __pfx_rtnetlink_rcv_msg+0x10/0x10
kernel: netlink_rcv_skb+0x58/0x110
kernel: netlink_unicast+0x1a3/0x290
kernel: netlink_sendmsg+0x254/0x4d0
kernel: __sys_sendto+0x1f6/0x200
kernel: __x64_sys_sendto+0x24/0x30
kernel: do_syscall_64+0x5d/0x90
kernel: ? do_syscall_64+0x6c/0x90
kernel: ? do_syscall_64+0x6c/0x90
kernel: ? syscall_exit_to_user_mode+0x2b/0x40
kernel: ? do_syscall_64+0x6c/0x90
kernel: entry_SYSCALL_64_after_hwframe+0x6e/0xd8
kernel: RIP: 0033:0x7f977ae729ec
kernel: RSP: 002b:00007ffeeb6aba50 EFLAGS: 00000246 ORIG_RAX:
000000000000002c
kernel: RAX: ffffffffffffffda RBX: 000056084849e910 RCX:
00007f977ae729ec
kernel: RDX: 0000000000000014 RSI: 00007ffeeb6abad0 RDI:
0000000000000007
kernel: RBP: 0000000000000000 R08: 0000000000000000 R09:
0000000000000000
kernel: R10: 0000000000004000 R11: 0000000000000246 R12:
0000000000000014
kernel: R13: 0000000000000000 R14: 0000000000000000 R15:
0000000000000000
kernel: </TASK>
kernel: INFO: task pool-gnome-shel:1986 blocked for more than 122
seconds.
kernel: Not tainted 6.6.2-arch1-1 #1
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
kernel: task:pool-gnome-shel state:D stack:0 pid:1986 ppid:1513
flags:0x00000002
kernel: Call Trace:
kernel: <TASK>
kernel: __schedule+0x3e8/0x1410
kernel: schedule+0x5e/0xd0
kernel: schedule_preempt_disabled+0x15/0x30
kernel: __mutex_lock.constprop.0+0x39a/0x6a0
kernel: __netlink_dump_start+0x75/0x290
kernel: ? __pfx_rtnl_dump_all+0x10/0x10
kernel: rtnetlink_rcv_msg+0x277/0x3c0
kernel: ? __pfx_rtnl_dump_all+0x10/0x10
kernel: ? __pfx_rtnetlink_rcv_msg+0x10/0x10
kernel: netlink_rcv_skb+0x58/0x110
kernel: netlink_unicast+0x1a3/0x290
kernel: netlink_sendmsg+0x254/0x4d0
kernel: __sys_sendto+0x1f6/0x200
kernel: __x64_sys_sendto+0x24/0x30
kernel: do_syscall_64+0x5d/0x90
kernel: ? syscall_exit_to_user_mode+0x2b/0x40
kernel: ? do_syscall_64+0x6c/0x90
kernel: ? exc_page_fault+0x7f/0x180
kernel: entry_SYSCALL_64_after_hwframe+0x6e/0xd8
kernel: RIP: 0033:0x7f232af30bfc
kernel: RSP: 002b:00007f223e1fbba0 EFLAGS: 00000293 ORIG_RAX:
000000000000002c
kernel: RAX: ffffffffffffffda RBX: 00007f223e1fccc0 RCX:
00007f232af30bfc
kernel: RDX: 0000000000000014 RSI: 00007f223e1fccc0 RDI:
0000000000000028
kernel: RBP: 0000000000000000 R08: 00007f223e1fcc64 R09:
000000000000000c
kernel: R10: 0000000000000000 R11: 0000000000000293 R12:
0000000000000028
kernel: R13: 00007f223e1fcc80 R14: 0000000000000665 R15:
000055638262fd10
kernel: </TASK>
kernel: INFO: task evolution-sourc:1819 blocked for more than 122
seconds.
kernel: Not tainted 6.6.2-arch1-1 #1
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
kernel: task:evolution-sourc state:D stack:0 pid:1819 ppid:1513
flags:0x00000006
kernel: Call Trace:
kernel: <TASK>
kernel: __schedule+0x3e8/0x1410
kernel: schedule+0x5e/0xd0
kernel: schedule_preempt_disabled+0x15/0x30
kernel: __mutex_lock.constprop.0+0x39a/0x6a0
kernel: ? netlink_lookup+0x151/0x1d0
kernel: __netlink_dump_start+0x75/0x290
kernel: ? __pfx_rtnl_dump_all+0x10/0x10
kernel: rtnetlink_rcv_msg+0x277/0x3c0
kernel: ? __pfx_rtnl_dump_all+0x10/0x10
kernel: ? __pfx_rtnetlink_rcv_msg+0x10/0x10
kernel: netlink_rcv_skb+0x58/0x110
kernel: netlink_unicast+0x1a3/0x290
kernel: netlink_sendmsg+0x254/0x4d0
kernel: __sys_sendto+0x1f6/0x200
kernel: __x64_sys_sendto+0x24/0x30
kernel: do_syscall_64+0x5d/0x90
kernel: ? do_syscall_64+0x6c/0x90
kernel: ? sock_getsockopt+0x22/0x30
kernel: ? __fget_light+0x99/0x100
kernel: ? __sys_setsockopt+0x129/0x1d0
kernel: ? syscall_exit_to_user_mode+0x2b/0x40
kernel: ? do_syscall_64+0x6c/0x90
kernel: entry_SYSCALL_64_after_hwframe+0x6e/0xd8
kernel: RIP: 0033:0x7f6aa096c9ec
kernel: RSP: 002b:00007fff2b442820 EFLAGS: 00000246 ORIG_RAX:
000000000000002c
kernel: RAX: ffffffffffffffda RBX: 0000561e6b466d80 RCX:
00007f6aa096c9ec
kernel: RDX: 0000000000000014 RSI: 00007fff2b4428a0 RDI:
000000000000000a
kernel: RBP: 0000000000000000 R08: 0000000000000000 R09:
0000000000000000
kernel: R10: 0000000000004000 R11: 0000000000000246 R12:
0000000000000014
kernel: R13: 00007fff2b442a70 R14: 0000000000000000 R15:
0000000000000001
kernel: </TASK>
kernel: INFO: task gnome-software:1904 blocked for more than 122
seconds.
kernel: Not tainted 6.6.2-arch1-1 #1
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
kernel: task:gnome-software state:D stack:0 pid:1904 ppid:1613
flags:0x00000002
kernel: Call Trace:
kernel: <TASK>
kernel: __schedule+0x3e8/0x1410
kernel: ? __pte_offset_map_lock+0x9e/0x110
kernel: schedule+0x5e/0xd0
kernel: schedule_preempt_disabled+0x15/0x30
kernel: __mutex_lock.constprop.0+0x39a/0x6a0
kernel: ? netlink_lookup+0x151/0x1d0
kernel: __netlink_dump_start+0x75/0x290
kernel: ? __pfx_rtnl_dump_all+0x10/0x10
kernel: rtnetlink_rcv_msg+0x277/0x3c0
kernel: ? __pfx_rtnl_dump_all+0x10/0x10
kernel: ? __pfx_rtnetlink_rcv_msg+0x10/0x10
kernel: netlink_rcv_skb+0x58/0x110
kernel: netlink_unicast+0x1a3/0x290
kernel: netlink_sendmsg+0x254/0x4d0
kernel: __sys_sendto+0x1f6/0x200
kernel: __x64_sys_sendto+0x24/0x30
kernel: do_syscall_64+0x5d/0x90
kernel: ? __fget_light+0x99/0x100
kernel: ? __sys_setsockopt+0x129/0x1d0
kernel: ? syscall_exit_to_user_mode+0x2b/0x40
kernel: ? do_syscall_64+0x6c/0x90
kernel: ? syscall_exit_to_user_mode+0x2b/0x40
kernel: ? do_syscall_64+0x6c/0x90
kernel: ? exc_page_fault+0x7f/0x180
kernel: entry_SYSCALL_64_after_hwframe+0x6e/0xd8
kernel: RIP: 0033:0x7fdbfd26d9ec
kernel: RSP: 002b:00007ffd15dd63e0 EFLAGS: 00000246 ORIG_RAX:
000000000000002c
kernel: RAX: ffffffffffffffda RBX: 000056133c78f580 RCX:
00007fdbfd26d9ec
kernel: RDX: 0000000000000014 RSI: 00007ffd15dd6460 RDI:
000000000000000b
kernel: RBP: 0000000000000000 R08: 0000000000000000 R09:
0000000000000000
kernel: R10: 0000000000004000 R11: 0000000000000246 R12:
0000000000000014
kernel: R13: 00007ffd15dd6630 R14: 0000000000000000 R15:
0000000000000001
kernel: </TASK>
kernel: Future hung task reports are suppressed, see sysctl
kernel.hung_task_warnings
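Judging from the call traces, the issue appears to be caused by commit
621735f590643e3048ca2060c285b80551660601 (r8169: fix rare issue with
broken rx after link-down on RTL8125), which was backported to 6.6.2.
In the NetworkManager trace, rtl8169_change_mtu() calls phy_start_aneg(),
which reaches rtl_reset_work() through phy_check_link_status(),
phy_link_change() and r8169_phylink_handler(). rtl_reset_work() then
calls phy_start_aneg() a second time and blocks in __mutex_lock().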
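The suspected re-entry can be modelled in user space. The following is only a sketch, not kernel code: the function names are copied from the NetworkManager trace, `threading.Lock` stands in for the PHY mutex, and it is an assumption (inferred from the trace, not checked against the source) that the outer phy_start_aneg() still holds that mutex when the link-change callback runs. `simulate_mtu_change` and its `timeout` parameter are made up for the demo.

```python
import threading


def simulate_mtu_change(timeout=0.1):
    """Replay the call chain from the NetworkManager trace against one lock.

    Returns (completed, trace): whether the chain finished, and the ordered
    list of functions that were entered.
    """
    # Stand-in for the PHY mutex. Like a kernel mutex, it is not recursive.
    phydev_lock = threading.Lock()
    trace = []

    def phy_start_aneg():
        trace.append("phy_start_aneg")
        # mutex_lock() in the kernel would sleep forever here; the timeout
        # exists only so that this demo can report the failure and return.
        if not phydev_lock.acquire(timeout=timeout):
            return False
        try:
            # Assumption: the link-status check runs with the lock held.
            return phy_check_link_status()
        finally:
            phydev_lock.release()

    def phy_check_link_status():
        trace.append("phy_check_link_status")
        return phy_link_change()

    def phy_link_change():
        trace.append("phy_link_change")
        return r8169_phylink_handler()

    def r8169_phylink_handler():
        trace.append("r8169_phylink_handler")
        return rtl_reset_work()

    def rtl_reset_work():
        trace.append("rtl_reset_work")
        # Re-enters phy_start_aneg() while the outer call holds the lock.
        return phy_start_aneg()

    def rtl8169_change_mtu():
        trace.append("rtl8169_change_mtu")
        return phy_start_aneg()

    return rtl8169_change_mtu(), trace


if __name__ == "__main__":
    completed, trace = simulate_mtu_change()
    print(" -> ".join(trace))
    print("completed" if completed else "inner phy_start_aneg could not take the lock")
```

In this model the second phy_start_aneg() never obtains the lock. In the kernel the same situation would leave the task in state D indefinitely, which matches the NetworkManager report above.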
Ian