Message-ID: <20250117-frisky-macho-bustard-e92632@leitao>
Date: Fri, 17 Jan 2025 04:08:53 -0800
From: Breno Leitao <leitao@...ian.org>
To: michael.chan@...adcom.com, pavan.chebbi@...adcom.com
Cc: netdev@...r.kernel.org, kuba@...nel.org, kernel-team@...a.com
Subject: bnxt_en: NETDEV WATCHDOG in 6.13-rc7
Hello,
I am deploying 6.13-rc7 at commit 619f0b6fad52 ("Merge tag 'seccomp-v6.13-rc8' of
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux") on a machine with a
Broadcom BCM57452 NetXtreme-E 10Gb/25Gb/40Gb/50Gb NIC. The machine's network is
down, with error messages being logged and the NETDEV WATCHDOG kicking in.
Are you familiar with anything similar?
Here are some example messages:
bnxt_en 0000:04:00.0 eth0: NETDEV WATCHDOG: CPU: 2: transmit queue 1 timed out 5123 ms
bnxt_en 0000:04:00.0 eth0: TX timeout detected, starting reset task!
bnxt_en 0000:04:00.0 eth0: [0.0]: tx{fw_ring: 0 prod: a cons: 8}
bnxt_en 0000:04:00.0 eth0: [0]: rx{fw_ring: 0 prod: 1ff} rx_agg{fw_ring: 9 agg_prod: 7fc sw_agg_prod: 7fc}
Later, I get the following hung task report:
Tainted: G N 6.13.0-rc7-kbuilder-00043-g619f0b6fad52 #3
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000
Call Trace:
<TASK>
__schedule+0xb72/0x3690
? __pfx___schedule+0x10/0x10
? __pfx_lock_release+0x10/0x10
schedule+0xea/0x3c0
async_synchronize_cookie_domain+0x1b8/0x210
? __pfx_async_synchronize_cookie_domain+0x10/0x10
? __pfx_autoremove_wake_function+0x10/0x10
? kernel_init_freeable+0x500/0x6d0
? __pfx_kernel_init+0x10/0x10
kernel_init+0x24/0x1e0
? _raw_spin_unlock_irq+0x33/0x50
ret_from_fork+0x31/0x70
? __pfx_kernel_init+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK>
Showing all locks held in the system:
3 locks held by kworker/u144:0/11:
#0: ffff88810a1b5948 ((wq_completion)async){+.+.}-{0:0}, at: process_one_work+0x1090/0x1950
#1: ffffc9000013fda0 ((work_completion)(&entry->work)){+.+.}-{0:0}, at: process_one_work+0x7eb/0x1950
#2: ffff8881128081b0 (&dev->mutex){....}-{4:4}, at: __driver_attach_async_helper+0xa4/0x260
1 lock held by khungtaskd/203:
#0: ffffffff8669a1e0 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x75/0x330
7 locks held by kworker/u144:3/208:
4 locks held by kworker/u144:4/290:
#0: ffff88811db39948 ((wq_completion)bnxt_pf_wq){+.+.}-{0:0}, at: process_one_work+0x1090/0x1950
#1: ffffc9000303fda0 ((work_completion)(&bp->sp_task)){+.+.}-{0:0}, at: process_one_work+0x7eb/0x1950
#2: ffffffff86f71208 (rtnl_mutex){+.+.}-{4:4}, at: bnxt_reset+0x30/0xa0
#3: ffff88811e41d160 (&bp->hwrm_cmd_lock){+.+.}-{4:4}, at: __hwrm_send+0x2f6/0x28d0
3 locks held by kworker/u144:6/322:
#0: ffff88810812a948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1090/0x1950
#1: ffffc90003a4fda0 ((linkwatch_work).work){+.+.}-{0:0}, at: process_one_work+0x7eb/0x1950
#2: ffffffff86f71208 (rtnl_mutex){+.+.}-{4:4}, at: linkwatch_event+0xe/0x60
=============================================
Full log at https://pastebin.com/4pWmaayt
Thanks
--breno