[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20140603020001.GA4903@richard>
Date: Tue, 3 Jun 2014 10:00:01 +0800
From: Wei Yang <weiyang@...ux.vnet.ibm.com>
To: Bjorn Helgaas <bhelgaas@...gle.com>
Cc: Wei Yang <weiyang@...ux.vnet.ibm.com>,
Or Gerlitz <ogerlitz@...lanox.com>,
David Miller <davem@...emloft.net>,
netdev <netdev@...r.kernel.org>, Amir Vadai <amirv@...lanox.com>,
Jack Morgenstein <jackm@....mellanox.co.il>
Subject: Re: [PATCH net] net/mlx4_core: Fix Oops on reboot when SRIOV VFs are
probed into the Host
On Mon, Jun 02, 2014 at 10:10:01AM -0600, Bjorn Helgaas wrote:
>On Mon, Jun 2, 2014 at 8:29 AM, Wei Yang <weiyang@...ux.vnet.ibm.com> wrote:
>> On Sun, Jun 01, 2014 at 01:49:43PM +0300, Or Gerlitz wrote:
>>>From: Jack Morgenstein <jackm@....mellanox.co.il>
>>>
>>>Commit befdf89 did not take into account the case where the Host
>>>driver is being unloaded. In this case, pci_get_drvdata for the VF
>>
>> In my mind, unloading PF's driver when there is alive VFs is not allowed.
>> Quoted in driver code:
>>
>> /* in SRIOV it is not allowed to unload the pf's
>> * driver while there are alive vf's */
>> if (mlx4_is_master(dev) && mlx4_how_many_lives_vf(dev))
>> printk(KERN_ERR "Removing PF when there are assigned VF's !!!\n");
>>
>> Actually, I don't understand this restriction clearly. Maybe my understanding
>> of alive VF is not correct.
>>
>> And in your code, unload PF's driver would call pci_disable_sriov() which will
>> destroy the VFs. While in your test, the VF's driver is still there?
>>
>>>remove_one call may return NULL, so that dereferencing the priv
>>>struct results in a kernel oops.
>>
>> Sorry for my poor mind, I still can't understand this situation.
>> Would you describe the situation more? You are unloading PF's driver in Host
>> at first, and then try to release the VF's driver?
>>
>>>
>>>The fix is to also test that the dev pointer returned by
>>>pci_get_drvdata is non-NULL.
>>>
>>>Fixes: befdf89 ("preserve pcd_dev_data after __mlx4_remove_one()")
>>>Signed-off-by: Jack Morgenstein <jackm@....mellanox.co.il>
>>>Signed-off-by: Or Gerlitz <ogerlitz@...lanox.com>
>>>---
>>> drivers/net/ethernet/mellanox/mlx4/main.c | 2 +-
>>> 1 files changed, 1 insertions(+), 1 deletions(-)
>>>
>>>diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
>>>index c187d74..a6ae089 100644
>>>--- a/drivers/net/ethernet/mellanox/mlx4/main.c
>>>+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
>>>@@ -2629,7 +2629,7 @@ static void __mlx4_remove_one(struct pci_dev *pdev)
>>> int pci_dev_data;
>>> int p;
>>>
>>>- if (priv->removed)
>>>+ if (!dev || priv->removed)
>>> return;
>>
>> This fix looks good to me.
>>
>> As I remembered, I had this check in my first version, but I removed the check
>> on dev based on the suggestion from Bjorn. Since I agreed that there is no
>> chance for dev to be NULL. Bjorn, seems we are not correct :(
>
>Writing a driver is not an empirical process of trying things to see
>what works. You need to actively design a consistent structure so you
>know why and when things are safe. I object to gratuitous "dev ==
>NULL" checks because often they are just a way of patching up a driver
>design that isn't well thought-out.
>
>As I wrote before:
>
> From the PCI core's perspective, after .probe() returns successfully,
> we can call any driver entry point and pass the pci_dev to it, and
> expect it to work. Doing mlx4_remove_one() in mlx4_pci_err_detected()
> sort of breaks that assumption because you clear out pci_drvdata().
> Right now, the only other entry point mlx4 really implements is
> mlx4_remove_one(), and it has a hack that tests whether pci_drvdata()
> is NULL. But that's ... a hack, and you'll have to do the same
> if/when you implement suspend/resume/sriov_configure/etc.
Thanks for your kindness. After re-reading it, I understand it more, it is not
only related to the Mellanox driver, but also the whole picture about how to
write a driver.
1. We should make the driver entry save, after .probe() returns successfully.
2. If there is an exception and a hack to test the pci_drvdata(), we need to
have this hack in suspend/resum/etc.
Now back to the current mlx4 driver, mlx4_remove_one() is called by .shutdown
and .remove. In my mind, these two hook is invoked by rmmod or reboot. By
doing so, it is trying to comply with rule 1, make sure the pci_drvdata() is
valid, after .probe() succeed.
Then I am curious about in which case the driver break this rule.
Following is my suggestion:
1. To comply with rule 1, it would be better to fix this point instead of add
a hack.
2. Or to comply with rule 2, the driver needs to check pci_drvdata() in every
driver's entry instead of just in one driver entry. For example,
mlx4_pci_slot_reset() need this check too.
Bjorn, thanks again, hope my understanding this time is correct :-)
>
>Bjorn
--
Richard Yang
Help you, Help me
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists