Message-ID: <0c59cb62-3156-54bb-0f36-837369adf220@linux.ibm.com>
Date: Fri, 1 May 2020 00:31:53 +0200
From: Niklas Schnelle <schnelle@...ux.ibm.com>
To: Saeed Mahameed <saeedm@...lanox.com>
Cc: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"leon@...nel.org" <leon@...nel.org>
Subject: Re: [PATCH 1/1] net/mlx5: Call pci_disable_sriov() on remove
On 4/30/20 9:47 PM, Niklas Schnelle wrote:
>
>
> On 4/30/20 5:58 PM, Saeed Mahameed wrote:
>> On Thu, 2020-04-30 at 14:03 +0200, Niklas Schnelle wrote:
>>> as described in Documentation/PCI/pci-iov-howto.rst a driver with SR-IOV
>>> support should call pci_disable_sriov() in the remove handler.
>>
>> Hi Niklas,
>>
>> looking at the documentation, it doesn't say "should" it just gives the
>> code as example.
>>
>>> Otherwise removing a PF (e.g. via pci_stop_and_remove_bus_device()) with
>>> attached VFs does not properly shut the VFs down before shutting down
>>> the PF. This leads to the VF drivers handling defunct devices and
>>> accompanying error messages.
>>>
>>
>> Which should be the admin responsibility .. if the admin want to do
>> this, then let it be.. why block him ?
>>
>> our mlx5 driver in the vf handles this gracefully and once pf
>> driver/device is back online the vf driver quickly recovers.
> See my answer to your other answer ;-)
>>
>>> In the current code pci_disable_sriov() is already called in
>>> mlx5_sriov_disable() but not in mlx5_sriov_detach() which is called from
>>> the remove handler. Fix this by moving the pci_disable_sriov() call into
>>> mlx5_device_disable_sriov() which is called by both.
>>>
>>> Signed-off-by: Niklas Schnelle <schnelle@...ux.ibm.com>
>>> ---
>>> drivers/net/ethernet/mellanox/mlx5/core/sriov.c | 3 ++-
>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
>>> index 3094d20297a9..2401961c9f5b 100644
>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
>>> @@ -114,6 +114,8 @@ mlx5_device_disable_sriov(struct mlx5_core_dev *dev, int num_vfs, bool clear_vf)
>>> int err;
>>> int vf;
>>>
>>> + pci_disable_sriov(dev->pdev);
>>> +
>>> for (vf = num_vfs - 1; vf >= 0; vf--) {
>>> if (!sriov->vfs_ctx[vf].enabled)
>>> continue;
>>> @@ -156,7 +158,6 @@ static void mlx5_sriov_disable(struct pci_dev *pdev)
>>> struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
>>> int num_vfs = pci_num_vf(dev->pdev);
>>>
>>> - pci_disable_sriov(pdev);
>>
>> this patch is no good as it breaks code symmetry.. and could lead to
>> many new issues.
> Ah you're right I totally missed that there is a matching pci_enable_sriov() in
> mlx5_enable_sriov() haven't used these myself before and since it wasn't in the
> documentation example I somehow expected it to happen in non-driver code,
Correction: pci_enable_sriov() actually is in the documentation example, so
I clearly sent this patch before it was ready. Sorry again.
> so for symmetry that would also have to move to mlx5_device_enable_sriov(),
> sorry for the oversight.
>>
>>
>>> mlx5_device_disable_sriov(dev, num_vfs, true);
>>> }
>>>
>>
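
To make the symmetry point concrete, here is a rough user-space sketch of the
call ordering that would result if both pci_enable_sriov() and
pci_disable_sriov() lived in the shared helpers. Everything in it is a
hypothetical stand-in (the mock_* names, the trace buffer, the "fw_*" labels);
it is not mlx5 or kernel code and only models the order of calls, assuming the
enable side keeps its current "device first, then PCI" ordering.

```c
#include <string.h>

/* Hypothetical stand-ins for illustration only; NOT the kernel or mlx5 API. */
static char trace[128];

static void trace_reset(void)
{
	trace[0] = '\0';
}

/* Append a call name to the trace, separated by single spaces. */
static void log_call(const char *name)
{
	if (trace[0])
		strncat(trace, " ", sizeof(trace) - strlen(trace) - 1);
	strncat(trace, name, sizeof(trace) - strlen(trace) - 1);
}

static void mock_pci_enable_sriov(void)  { log_call("pci_enable"); }
static void mock_pci_disable_sriov(void) { log_call("pci_disable"); }

/* Shared helper: set up the VFs on the device, then expose them on the bus. */
static void mock_device_enable_sriov(void)
{
	log_call("fw_enable_vfs");
	mock_pci_enable_sriov();
}

/* Shared helper: exact mirror image, remove the VFs from the bus first. */
static void mock_device_disable_sriov(void)
{
	mock_pci_disable_sriov();
	log_call("fw_disable_vfs");
}

/* Models the sriov_configure path (enable and disable via sysfs). */
static void mock_sriov_enable(void)  { mock_device_enable_sriov(); }
static void mock_sriov_disable(void) { mock_device_disable_sriov(); }

/* Models the remove-handler path, which reuses the same shared helper. */
static void mock_sriov_detach(void)  { mock_device_disable_sriov(); }
```

With this layout, calling mock_sriov_enable() followed by mock_sriov_detach()
tears the VFs down in the reverse order of setup, the same as calling
mock_sriov_disable() would, which is the property the moved call is meant to
give the remove path.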