Message-ID: <2409e7071482b8d05447b8660abcac15987ad399.camel@mellanox.com>
Date: Thu, 30 Apr 2020 15:58:04 +0000
From: Saeed Mahameed <saeedm@...lanox.com>
To: "schnelle@...ux.ibm.com" <schnelle@...ux.ibm.com>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"leon@...nel.org" <leon@...nel.org>
Subject: Re: [PATCH 1/1] net/mlx5: Call pci_disable_sriov() on remove
On Thu, 2020-04-30 at 14:03 +0200, Niklas Schnelle wrote:
> as described in Documentation/PCI/pci-iov-howto.rst a driver with SR-IOV
> support should call pci_disable_sriov() in the remove handler.
Hi Niklas,
Looking at the documentation, it doesn't say "should"; it just gives the
code as an example.
> Otherwise removing a PF (e.g. via pci_stop_and_remove_bus_device()) with
> attached VFs does not properly shut the VFs down before shutting down
> the PF. This leads to the VF drivers handling defunct devices and
> accompanying error messages.
>
That should be the admin's responsibility. If the admin wants to do
this, then let it be; why block them?
Our mlx5 driver in the VF handles this gracefully, and once the PF
driver/device is back online the VF driver quickly recovers.
> In the current code pci_disable_sriov() is already called in
> mlx5_sriov_disable() but not in mlx5_sriov_detach() which is called from
> the remove handler. Fix this by moving the pci_disable_sriov() call into
> mlx5_device_disable_sriov() which is called by both.
>
> Signed-off-by: Niklas Schnelle <schnelle@...ux.ibm.com>
> ---
> drivers/net/ethernet/mellanox/mlx5/core/sriov.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
> index 3094d20297a9..2401961c9f5b 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
> @@ -114,6 +114,8 @@ mlx5_device_disable_sriov(struct mlx5_core_dev *dev, int num_vfs, bool clear_vf)
> int err;
> int vf;
>
> + pci_disable_sriov(dev->pdev);
> +
> for (vf = num_vfs - 1; vf >= 0; vf--) {
> if (!sriov->vfs_ctx[vf].enabled)
> continue;
> @@ -156,7 +158,6 @@ static void mlx5_sriov_disable(struct pci_dev *pdev)
> struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
> int num_vfs = pci_num_vf(dev->pdev);
>
> - pci_disable_sriov(pdev);
This patch is no good as it breaks code symmetry, and it could lead to
many new issues.
> mlx5_device_disable_sriov(dev, num_vfs, true);
> }
>
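
The two call paths discussed in this thread can be sketched as a small
userspace model in C. This is only an illustration under stated
assumptions, not kernel or mlx5 code: struct model_pf, the model_*
functions, and vfs_left_after() are hypothetical stand-ins named after
the functions mentioned in the patch description. The model shows only
which path reaches the pci_disable_sriov() step before the PF-side
teardown; it takes no position on whether leaving VFs in place on
remove is acceptable, which is the point under discussion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of a PF; not a kernel structure. */
struct model_pf {
	int vfs_on_bus;   /* VFs still present on the (modelled) PCI bus */
	int orphaned_vfs; /* VFs still present after PF-side teardown */
};

/* Stand-in for pci_disable_sriov(): takes the VFs off the bus. */
static void model_pci_disable_sriov(struct model_pf *pf)
{
	pf->vfs_on_bus = 0;
}

/* Stand-in for mlx5_device_disable_sriov(): PF-side VF teardown.
 * With patched == true it also disables SR-IOV first, as the
 * proposed patch does. */
static void model_device_disable_sriov(struct model_pf *pf, bool patched)
{
	if (patched)
		model_pci_disable_sriov(pf);
	/* Any VF still on the bus now has no PF-side state behind it. */
	pf->orphaned_vfs = pf->vfs_on_bus;
}

/* Stand-in for mlx5_sriov_disable(): unpatched, it disables SR-IOV
 * itself before the PF-side teardown. */
static void model_sriov_disable(struct model_pf *pf, bool patched)
{
	if (!patched)
		model_pci_disable_sriov(pf);
	model_device_disable_sriov(pf, patched);
}

/* Stand-in for mlx5_sriov_detach(): the path taken from remove. */
static void model_sriov_detach(struct model_pf *pf, bool patched)
{
	model_device_disable_sriov(pf, patched);
}

/* Run one path on a PF with four VFs; report how many VFs remain. */
static int vfs_left_after(void (*path)(struct model_pf *, bool), bool patched)
{
	struct model_pf pf = { .vfs_on_bus = 4, .orphaned_vfs = 0 };

	assert(path);
	path(&pf, patched);
	return pf.orphaned_vfs;
}
```

In this model, vfs_left_after(model_sriov_detach, false) reports the
four VFs still present, while the same call with patched set to true
reports none. The model_sriov_disable path reports none in either case.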