[   70.773478] IPv6: ADDRCONF(NETDEV_CHANGE): ens514np0: link becomes ready
[   70.858688] RPC: Registered named UNIX socket transport module.
[   70.858709] RPC: Registered udp transport module.
[   70.858714] RPC: Registered tcp transport module.
[   70.858719] RPC: Registered tcp NFSv4.1 backchannel transport module.
[   70.922689] RPC: Registered rdma transport module.
[   70.922724] RPC: Registered rdma backchannel transport module.
[   92.022705] mlx5_core 0000:00:00.0: poll_health:702:(pid 0): Fatal error 1 detected
[   92.022995] mlx5_core 0000:00:00.0: print_health_info:380:(pid 0): assert_var[0] 0xffffffff
[   92.023002] mlx5_core 0000:00:00.0: print_health_info:380:(pid 0): assert_var[1] 0xffffffff
[   92.023009] mlx5_core 0000:00:00.0: print_health_info:380:(pid 0): assert_var[2] 0xffffffff
[   92.023016] mlx5_core 0000:00:00.0: print_health_info:380:(pid 0): assert_var[3] 0xffffffff
[   92.023023] mlx5_core 0000:00:00.0: print_health_info:380:(pid 0): assert_var[4] 0xffffffff
[   92.023030] mlx5_core 0000:00:00.0: print_health_info:383:(pid 0): assert_exit_ptr 0xffffffff
[   92.023038] mlx5_core 0000:00:00.0: print_health_info:385:(pid 0): assert_callra 0xffffffff
[   92.023048] mlx5_core 0000:00:00.0: print_health_info:388:(pid 0): fw_ver 65535.65535.65535
[   92.023056] mlx5_core 0000:00:00.0: print_health_info:389:(pid 0): hw_id 0xffffffff
[   92.023065] mlx5_core 0000:00:00.0: print_health_info:390:(pid 0): irisc_index 255
[   92.023075] mlx5_core 0000:00:00.0: print_health_info:391:(pid 0): synd 0xff: unrecognized error
[   92.023129] mlx5_core 0000:00:00.0: print_health_info:393:(pid 0): ext_synd 0xffff
[   92.023136] mlx5_core 0000:00:00.0: print_health_info:395:(pid 0): raw fw_ver 0xffffffff
[   92.023823] crw_info : CRW reports slct=0, oflw=0, chn=0, rsc=B, anc=0, erc=0, rsid=0
[   92.023859] mlx5_core 0000:00:00.0: PME# disabled
[  102.751227] mlx5_core 0000:00:00.0: poll_health:717:(pid 10): device's health compromised - reached miss count
[  102.751247] mlx5_core 0000:00:00.0: print_health_info:380:(pid 10): assert_var[0] 0xffffffff
[  102.751253] mlx5_core 0000:00:00.0: print_health_info:380:(pid 10): assert_var[1] 0xffffffff
[  102.751260] mlx5_core 0000:00:00.0: print_health_info:380:(pid 10): assert_var[2] 0xffffffff
[  102.751266] mlx5_core 0000:00:00.0: print_health_info:380:(pid 10): assert_var[3] 0xffffffff
[  102.751273] mlx5_core 0000:00:00.0: print_health_info:380:(pid 10): assert_var[4] 0xffffffff
[  102.751280] mlx5_core 0000:00:00.0: print_health_info:383:(pid 10): assert_exit_ptr 0xffffffff
[  102.751287] mlx5_core 0000:00:00.0: print_health_info:385:(pid 10): assert_callra 0xffffffff
[  102.751296] mlx5_core 0000:00:00.0: print_health_info:388:(pid 10): fw_ver 65535.65535.65535
[  102.751303] mlx5_core 0000:00:00.0: print_health_info:389:(pid 10): hw_id 0xffffffff
[  102.751311] mlx5_core 0000:00:00.0: print_health_info:390:(pid 10): irisc_index 255
[  102.751401] mlx5_core 0000:00:00.0: print_health_info:391:(pid 10): synd 0xff: unrecognized error
[  102.751407] mlx5_core 0000:00:00.0: print_health_info:393:(pid 10): ext_synd 0xffff
[  102.751413] mlx5_core 0000:00:00.0: print_health_info:395:(pid 10): raw fw_ver 0xffffffff
[  157.068876] mlx5_core 0000:00:00.0: wait_func:1008:(pid 74): 2RST_QP(0x50a) timeout. Will cause a leak of a command resource
[  157.069145] infiniband mlx5_0: destroy_qp_common:2367:(pid 74): mlx5_ib: modify QP 0x000724 to RESET failed
[  177.548701] mlx5_core 0000:00:00.0: wait_func:1008:(pid 8): QUERY_VPORT_COUNTER(0x770) timeout. Will cause a leak of a command resource
[  218.508382] mlx5_core 0000:00:00.0: wait_func:1008:(pid 74): DESTROY_QP(0x501) timeout. Will cause a leak of a command resource
[  238.988229] mlx5_core 0000:00:00.0: wait_func:1008:(pid 8): QUERY_Q_COUNTER(0x773) timeout. Will cause a leak of a command resource
[  244.108211] INFO: task NetworkManager:681 blocked for more than 122 seconds.
[  244.108446]       Not tainted 5.8.0-rc4 #1
[  244.108450] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  244.108456] NetworkManager  D 9240   681      1 0x00000000
[  244.108467] Call Trace:
[  244.108479]  [<00000000b9225c46>] __schedule+0x2d6/0x5a8
[  244.108485]  [<00000000b9225f72>] schedule+0x5a/0x130
[  244.108491]  [<00000000b9226704>] schedule_preempt_disabled+0x2c/0x48
[  244.108497]  [<00000000b92289c2>] __mutex_lock+0x372/0x960
[  244.108503]  [<00000000b9228fe2>] mutex_lock_nested+0x32/0x40
[  244.108565]  [<000003ff8020cbb2>] mlx5e_xdp+0x62/0xd0 [mlx5_core]
[  244.108573]  [<00000000b8fbcc24>] __dev_xdp_query.part.0+0x34/0x70
[  244.108582]  [<00000000b8fe1d3c>] rtnl_xdp_fill+0x84/0x258
[  244.108588]  [<00000000b8fe5f54>] rtnl_fill_ifinfo+0x6a4/0xd68
[  244.108595]  [<00000000b8fe6f3e>] rtnl_getlink+0x266/0x3e8
[  244.108602]  [<00000000b8fe2cce>] rtnetlink_rcv_msg+0x18e/0x4b0
[  244.108618]  [<00000000b904913e>] netlink_rcv_skb+0x4e/0xf8
[  244.108623]  [<00000000b90488b2>] netlink_unicast+0x18a/0x268
[  244.108627]  [<00000000b9048cf6>] netlink_sendmsg+0x366/0x448
[  244.108634]  [<00000000b8f94b4c>] sock_sendmsg+0x64/0x78
[  244.108638]  [<00000000b8f965d6>] ____sys_sendmsg+0x1f6/0x240
[  244.108643]  [<00000000b8f98c04>] ___sys_sendmsg+0x74/0xa8
[  244.108648]  [<00000000b8f98cf4>] __sys_sendmsg+0x64/0xa8
[  244.108652]  [<00000000b8f995b4>] __do_sys_socketcall+0x2dc/0x340
[  244.108676]  [<00000000b922e618>] system_call+0xdc/0x2c0
[  244.108682] 2 locks held by NetworkManager/681:
[  244.108688]  #0: 00000000b972cae8 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x160/0x4b0
[  244.108698]  #1: 00000000d3103b80 (&priv->state_lock){+.+.}-{3:3}, at: mlx5e_xdp+0x62/0xd0 [mlx5_core]
[  244.108731] Showing all locks held in the system:
[  244.108743] 3 locks held by kworker/u128:0/8:
[  244.108749]  #0: 00000000d2f21148 ((wq_completion)mlx5e){+.+.}-{0:0}, at: process_one_work+0x1dc/0x478
[  244.108764]  #1: 000003e000047e18 ((work_completion)(&priv->update_stats_work)){+.+.}-{0:0}, at: process_one_work+0x1dc/0x478
[  244.108771]  #2: 00000000d3103b80 (&priv->state_lock){+.+.}-{3:3}, at: mlx5e_update_stats_work+0x34/0x68 [mlx5_core]
[  244.108794] 3 locks held by kworker/1:0/17:
[  244.108798]  #0: 00000000e9922948 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: process_one_work+0x1dc/0x478
[  244.108805]  #1: 000003e00028fe18 ((addr_chk_work).work){+.+.}-{0:0}, at: process_one_work+0x1dc/0x478
[  244.108813]  #2: 00000000b972cae8 (rtnl_mutex){+.+.}-{3:3}, at: addrconf_verify_work+0x22/0x38
[  244.108822] 1 lock held by khungtaskd/29:
[  244.108826]  #0: 00000000b969a990 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire.constprop.0+0x0/0x50
[  244.108837] 6 locks held by kmcheck/74:
[  244.108841]  #0: 00000000b9739e48 (crw_handler_mutex){+.+.}-{3:3}, at: crw_collect_info+0x250/0x350
[  244.108849]  #1: 00000000b9719668 (pci_rescan_remove_lock){+.+.}-{3:3}, at: pci_stop_and_remove_bus_device_locked+0x26/0x48
[  244.108858]  #2: 00000000d8169238 (&dev->mutex){....}-{3:3}, at: device_release_driver+0x32/0x50
[  244.108866]  #3: 00000000d426cbb8 (&dev->intf_state_mutex){+.+.}-{3:3}, at: mlx5_unload_one+0x38/0x140 [mlx5_core]
[  244.108887]  #4: 000003ff8024a280 (mlx5_intf_mutex){+.+.}-{3:3}, at: mlx5_unregister_device+0x36/0xc8 [mlx5_core]
[  244.108908]  #5: 00000000cf4f8650 (&device->unregistration_lock){+.+.}-{3:3}, at: __ib_unregister_device+0x32/0xc8 [ib_core]
[  244.108957] 2 locks held by NetworkManager/681:
[  244.108961]  #0: 00000000b972cae8 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x160/0x4b0
[  244.108973]  #1: 00000000d3103b80 (&priv->state_lock){+.+.}-{3:3}, at: mlx5e_xdp+0x62/0xd0 [mlx5_core]