Message-ID: <YI5E9mgNDzPMXTRh@unreal>
Date: Sun, 2 May 2021 09:21:42 +0300
From: Leon Romanovsky <leonro@...dia.com>
To: Dennis Afanasev <dennis.afanasev@...teless.net>,
Vlad Buslov <vladbu@...dia.com>,
Dmytro Linkin <dlinkin@...dia.com>, Roi Dayan <roid@...dia.com>
CC: <saeedm@...dia.com>, <netdev@...r.kernel.org>,
<linux-rdma@...r.kernel.org>
Subject: Re: PROBLEM: mlx5_core driver crashes when a VRF device with a route
is added with mlx5 devices in switchdev mode
Thanks for the report.
+ more people.
On Fri, Apr 30, 2021 at 04:56:17PM -0400, Dennis Afanasev wrote:
> Dear Saeed and Leon,
> I am reporting a bug in the mlx5_core driver discovered by our team at
> Stateless while setting up SRIOV devices in eswitch mode. Below are the
> details and relevant files that relate to the bug. Please reach out to me
> if I can provide any further information.
>
> 1. Description of problem: When creating SRIOV devices off physical mlx5
>    PCIe devices and then putting the physical devices into switchdev mode,
>    adding a new VRF device with a default route will cause the mlx5_core
>    driver to segfault (replicate_bug1.sh). In addition, attempting to set
>    the physical devices to switchdev mode after adding a VRF with a default
>    route will cause the mlx5_core driver to segfault (replicate_bug2.sh).
>    The segfault occurs in the function mlx5e_tc_tun_fib_event in both cases.
>
> 2. Keywords: mlx5, mlx5_core, mlx5e_tc_tun_fib_event, tc, netdev, 5.12-rc7
>
> 3. Kernel information: Linux version 5.12.0-rc7 (root@...a) (gcc (Debian
>    10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP
>
> 4. Kernel config file: File attached - config-5.12.0-rc7
>
> 5. Oops message: Files attached - dmesg_output_bug1 and dmesg_output_bug2
>
> 6. Shell scripts to replicate: Files attached - replicate_bug1.sh and
>    replicate_bug2.sh
>
> 7. ver_linux output: File attached - ver_linux_output
>
> 8. Processor information: File attached - cpuinfo
>
> 9. Module information: File attached - modules
>
> 10. Loaded driver and hardware: Files attached - ioport and iomem
>
> 11. PCI information: File attached - pci_info
>
> 12. Other information: I hardcoded the PCIe addresses of the physical
>     devices and of the created SRIOV devices; these will have to be
>     adjusted depending on your machine.
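(A note on point 12 above: rather than hardcoding, the addresses could be enumerated by PCI vendor ID. The here-doc below stands in for real `lspci -D -d 15b3:` output so the sketch runs anywhere; the device model shown is illustrative, not taken from the report.)

```shell
# Enumerate Mellanox (vendor ID 15b3) PCI functions. On a real host,
# replace the here-doc with:  lspci -D -d 15b3:
addrs="$(awk '{print $1}' <<'EOF'
0000:5e:00.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
0000:5e:00.1 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
EOF
)"
echo "$addrs"
```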
> #!/bin/bash
>
> set -euxETo pipefail
>
> mst start
>
> # (Hardcoded) These need to be modified based on the host machine
> nic1_port0="0000:5e:00.0"
> nic1_port1="0000:5e:00.1"
>
> # Create 1 SRIOV device per NIC port
> echo 1 > /sys/bus/pci/drivers/mlx5_core/$nic1_port0/sriov_numvfs
> echo 1 > /sys/bus/pci/drivers/mlx5_core/$nic1_port1/sriov_numvfs
>
> # The SRIOV devices are given these addresses
> nic1_port0_vf="0000:5e:00.2"
> nic1_port1_vf="0000:5e:00.4"
>
> declare -ar PCIE_PHYSICAL_ADDRESSES=("$nic1_port0" "$nic1_port1")
> declare -ar PCIE_SRIOV_ADDRESSES=("$nic1_port0_vf" "$nic1_port1_vf")
>
> # Unbind the driver from the SRIOV, required to activate the eswitch
> for pcie_address in "${PCIE_SRIOV_ADDRESSES[@]}"; do
>     echo "${pcie_address}" > /sys/bus/pci/drivers/mlx5_core/unbind
> done
>
> # Wait for the binds to disappear
> for pcie_address in "${PCIE_SRIOV_ADDRESSES[@]}"; do
>     declare sys_symlink_file="/sys/bus/pci/drivers/mlx5_core/${pcie_address}"
>     until [[ ! -h "${sys_symlink_file}" ]]; do
>         inotifywait --event delete_self --timeout 1 "${sys_symlink_file}" || true
>     done
> done
> sync --file-system /sys
> udevadm settle --timeout=30
> sleep 5
>
> # Set the cards to 'switchdev'
> for pcie_address in "${PCIE_PHYSICAL_ADDRESSES[@]}"; do
>     devlink dev eswitch set "pci/${pcie_address}" mode switchdev encap-mode basic
> done
>
> # Wait for the cards to be in switchdev mode
> for pcie_address in "${PCIE_PHYSICAL_ADDRESSES[@]}"; do
>     until [[ "$(devlink -j dev eswitch show "pci/${pcie_address}" |
>         jq --arg dev "pci/${pcie_address}" -r '.dev[$dev].mode' 2> /dev/null)" == "switchdev" ]]; do
>         sleep 1
>     done
> done
> sync --file-system /sys
> udevadm settle --timeout=30
> sleep 5
>
> for pcie_address in "${PCIE_SRIOV_ADDRESSES[@]}"; do
>     echo "${pcie_address}" > /sys/bus/pci/drivers/mlx5_core/bind
> done
>
> ip link set group default up
> ip link add vrf0 type vrf table 100
>
> # This will crash the kernel
> ip route add table 100 unreachable default
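The switchdev wait loop in the script above hinges on a jq filter over `devlink -j` output. That filter can be checked in isolation against a canned sample document (field values below are illustrative, not captured from the affected host):

```shell
# Sample shape of `devlink -j dev eswitch show pci/0000:5e:00.0` output.
sample='{"dev":{"pci/0000:5e:00.0":{"mode":"switchdev","encap-mode":"basic"}}}'
# Same filter as the wait loop: pull out .mode for the named device.
mode="$(jq --arg dev "pci/0000:5e:00.0" -r '.dev[$dev].mode' <<< "$sample")"
echo "$mode"    # switchdev
```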
> #!/bin/bash
>
> set -euxETo pipefail
>
> mst start
>
> # Add the VRF device and a route
> ip link add vrf0 type vrf table 100
> ip route add table 100 unreachable default
>
> # (Hardcoded) These need to be modified based on the host machine
> nic1_port0="0000:5e:00.0"
> nic1_port1="0000:5e:00.1"
>
> # Create 1 SRIOV device per NIC port
> echo 1 > /sys/bus/pci/drivers/mlx5_core/$nic1_port0/sriov_numvfs
> echo 1 > /sys/bus/pci/drivers/mlx5_core/$nic1_port1/sriov_numvfs
>
> # The SRIOV devices are given these addresses
> nic1_port0_vf="0000:5e:00.2"
> nic1_port1_vf="0000:5e:00.4"
>
> declare -ar PCIE_PHYSICAL_ADDRESSES=("$nic1_port0" "$nic1_port1")
> declare -ar PCIE_SRIOV_ADDRESSES=("$nic1_port0_vf" "$nic1_port1_vf")
>
> # Unbind the driver from the SRIOV, required to activate the eswitch
> for pcie_address in "${PCIE_SRIOV_ADDRESSES[@]}"; do
>     echo "${pcie_address}" > /sys/bus/pci/drivers/mlx5_core/unbind
> done
>
> # Wait for the binds to disappear
> for pcie_address in "${PCIE_SRIOV_ADDRESSES[@]}"; do
>     declare sys_symlink_file="/sys/bus/pci/drivers/mlx5_core/${pcie_address}"
>     until [[ ! -h "${sys_symlink_file}" ]]; do
>         inotifywait --event delete_self --timeout 1 "${sys_symlink_file}" || true
>     done
> done
> sync --file-system /sys
> udevadm settle --timeout=30
>
> # Set the cards to 'switchdev'
> for pcie_address in "${PCIE_PHYSICAL_ADDRESSES[@]}"; do
>     # This will crash the kernel
>     devlink dev eswitch set "pci/${pcie_address}" mode switchdev encap-mode basic
> done
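The unbind-wait pattern both scripts use (loop until the sysfs driver symlink for the VF disappears) can be exercised without mlx5 hardware. A self-contained sketch: a throwaway symlink stands in for the sysfs driver link, a background job plays the role of the kernel removing it, and plain polling replaces inotifywait.

```shell
workdir="$(mktemp -d)"
ln -s /dev/null "${workdir}/0000:5e:00.2"          # stand-in for the driver link
( sleep 0.2; rm "${workdir}/0000:5e:00.2" ) &      # simulates the unbind completing
until [[ ! -h "${workdir}/0000:5e:00.2" ]]; do
    sleep 0.1    # the real scripts block on inotifywait instead of polling
done
result="unbound"
echo "$result"
rm -rf "${workdir}"
```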