Message-ID: <BY5PR13MB3604D3031E984CA34A57B7C9EEA09@BY5PR13MB3604.namprd13.prod.outlook.com>
Date: Mon, 20 Sep 2021 20:22:44 +0000
From: <Patrick.Mclean@...y.com>
To: <stable@...r.kernel.org>
CC: <regressions@...ts.linux.dev>, <ayal@...dia.com>,
<saeedm@...dia.com>, <netdev@...r.kernel.org>, <leonro@...dia.com>,
<Aaron.U'ren@...y.com>, <Russell.Brown@...y.com>,
<Victor.Payno@...y.com>
Subject: mlx5_core 5.10 stable series regression starting at 5.10.65
In 5.10 stable kernels starting with 5.10.65, certain mlx5 cards are no longer usable (relevant lspci output and dmesg logs are pasted below).
Bisecting tracks the problem down to this commit:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-5.10.y&id=fe6322774ca28669868a7e231e173e09f7422118
Here is how lspci -nn identifies the cards:
41:00.0 Ethernet controller [0200]: Mellanox Technologies MT27800 Family [ConnectX-5] [15b3:1017]
41:00.1 Ethernet controller [0200]: Mellanox Technologies MT27800 Family [ConnectX-5] [15b3:1017]
Here are the relevant dmesg logs:
[ 13.409473] mlx5_core 0000:41:00.0: firmware version: 16.31.1014
[ 13.415944] mlx5_core 0000:41:00.0: 126.016 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x16 link)
[ 13.707425] mlx5_core 0000:41:00.0: Rate limit: 127 rates are supported, range: 0Mbps to 24414Mbps
[ 13.718221] mlx5_core 0000:41:00.0: E-Switch: Total vports 2, per vport: max uc(128) max mc(2048)
[ 13.740607] mlx5_core 0000:41:00.0: Port module event: module 0, Cable plugged
[ 13.759857] mlx5_core 0000:41:00.0: mlx5_pcie_event:294:(pid 586): PCIe slot advertised sufficient power (75W).
[ 17.986973] mlx5_core 0000:41:00.0: E-Switch: cleanup
[ 18.686204] mlx5_core 0000:41:00.0: init_one:1371:(pid 803): mlx5_load_one failed with error code -22
[ 18.701352] mlx5_core: probe of 0000:41:00.0 failed with error -22
[ 18.727364] mlx5_core 0000:41:00.1: firmware version: 16.31.1014
[ 18.743853] mlx5_core 0000:41:00.1: 126.016 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x16 link)
[ 19.015349] mlx5_core 0000:41:00.1: Rate limit: 127 rates are supported, range: 0Mbps to 24414Mbps
[ 19.025157] mlx5_core 0000:41:00.1: E-Switch: Total vports 2, per vport: max uc(128) max mc(2048)
[ 19.053569] mlx5_core 0000:41:00.1: Port module event: module 1, Cable unplugged
[ 19.062093] mlx5_core 0000:41:00.1: mlx5_pcie_event:294:(pid 591): PCIe slot advertised sufficient power (75W).
[ 22.826932] mlx5_core 0000:41:00.1: E-Switch: cleanup
[ 23.544747] mlx5_core 0000:41:00.1: init_one:1371:(pid 803): mlx5_load_one failed with error code -22
[ 23.555071] mlx5_core: probe of 0000:41:00.1 failed with error -22
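For anyone comparing their own logs against the ones above, here is a minimal sketch that pulls the failed probes out of dmesg text and maps the error number to its symbolic name. It is only an illustration and was not part of the bisection: the helper name `failed_probes` and the regular expression are assumptions based on the two "probe of ... failed" lines shown here, and other kernels may word the message differently.

```python
import errno
import re

# Assumed message shape, taken from the log lines in this report:
#   mlx5_core: probe of 0000:41:00.0 failed with error -22
PROBE_FAIL = re.compile(
    r"mlx5_core: probe of "
    r"(?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) "
    r"failed with error (?P<code>-?\d+)"
)


def failed_probes(dmesg_text):
    """Return (PCI address, errno value, errno name) for each failed probe.

    Hypothetical helper for illustration only.
    """
    results = []
    for line in dmesg_text.splitlines():
        match = PROBE_FAIL.search(line)
        if match:
            # The kernel reports errors as negative errno values.
            code = abs(int(match.group("code")))
            name = errno.errorcode.get(code, "UNKNOWN")
            results.append((match.group("bdf"), code, name))
    return results


sample = """\
[ 18.701352] mlx5_core: probe of 0000:41:00.0 failed with error -22
[ 23.555071] mlx5_core: probe of 0000:41:00.1 failed with error -22
"""

for bdf, code, name in failed_probes(sample):
    print(f"{bdf}: -{code} ({name})")
```

On Linux, error 22 is EINVAL, so both ports are failing mlx5_load_one with "invalid argument".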
Please let me know if I can provide any further information.