Message-ID: <20140603084032.GA13874@richard>
Date:	Tue, 3 Jun 2014 16:40:32 +0800
From:	Wei Yang <weiyang@...ux.vnet.ibm.com>
To:	Or Gerlitz <or.gerlitz@...il.com>
Cc:	Bjorn Helgaas <bhelgaas@...gle.com>,
	David Miller <davem@...emloft.net>,
	Wei Yang <weiyang@...ux.vnet.ibm.com>,
	netdev <netdev@...r.kernel.org>, Amir Vadai <amirv@...lanox.com>,
	Jack Morgenstein <jackm@....mellanox.co.il>,
	Tal Alon <talal@...lanox.com>,
	Yevgeny Petrilin <yevgenyp@...lanox.com>
Subject: Re: [PATCH net] net/mlx4_core: Fix Oops on reboot when SRIOV VFs are
 probed into the Host

On Tue, Jun 03, 2014 at 11:15:43AM +0300, Or Gerlitz wrote:
>On Mon, Jun 2, 2014 at 7:10 PM, Bjorn Helgaas <bhelgaas@...gle.com> wrote:
>> Writing a driver is not an empirical process of trying things to see
>> what works.  You need to actively design a consistent structure so you
>> know why and when things are safe.  I object to gratuitous "dev ==
>> NULL" checks because often they are just a way of patching up a driver
>> design that isn't well thought-out.
>
>Bjorn, first and foremost: agreed.
>
>Next, to be precise, the use case of rebooting the host while the
>driver was loaded in SRIOV mode and NO VFs probed to VMs worked before
>commit befdf89 and is now broken.
>
>Reading your response further, I understand that the code was probably
>using a sort of hackish branching to make that happen, and you
>suggest we rewrite that section properly so it can serve us well when
>we (hopefully soon) implement sriov_configure and possibly also
>suspend/resume. Point taken.
>
>Dave, as for this patch: again, the regression (the inability to
>reboot the host node while the driver is loaded) exists in the latest
>upstream code as of befdf89 / 3.15-rc1.
>
>Now, taking into account that 3.15 is after rc8 and the IL devel team
>has a holiday this week, I don't see us coming up with a deeper fix in
>time for 3.15, so maybe you can eventually go and merge this one-liner
>for 3.15?

I am glad to verify your patch, if you wish.

>
>Or.
>
>
>> As I wrote before:
>>   From the PCI core's perspective, after .probe() returns successfully,
>>   we can call any driver entry point and pass the pci_dev to it, and
>>   expect it to work.  Doing mlx4_remove_one() in mlx4_pci_err_detected()
>>   sort of breaks that assumption because you clear out pci_drvdata().
>>   Right now, the only other entry point mlx4 really implements is
>>   mlx4_remove_one(), and it has a hack that tests whether pci_drvdata()
>>   is NULL.  But that's ... a hack, and you'll have to do the same
>>   if/when you implement suspend/resume/sriov_configure/etc.
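
To make the point above concrete, here is a minimal userspace sketch of the invariant Bjorn describes: once probe succeeds, every entry point can rely on the driver data being present, so no entry point needs a "drvdata == NULL" test. This is not mlx4 code and not the proposed patch. All model_* names are hypothetical stand-ins (model_pci_dev for struct pci_dev, the drvdata field for what pci_get_drvdata() returns), and the hw_active flag is an assumption about how a driver might track that the device has already been quiesced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-device private state (stand-in for the driver's priv). */
struct model_priv {
    bool hw_active;   /* true while device resources are set up */
};

/* Hypothetical stand-in for struct pci_dev. */
struct model_pci_dev {
    void *drvdata;    /* models the pointer behind pci_get_drvdata() */
};

/* probe: allocate private state and attach it to the device. */
static int model_probe(struct model_pci_dev *pdev)
{
    struct model_priv *priv = calloc(1, sizeof(*priv));

    if (!priv)
        return -1;
    priv->hw_active = true;
    pdev->drvdata = priv;
    return 0;
}

/* Tear down hardware state but leave the private data attached. */
static void model_unload(struct model_priv *priv)
{
    priv->hw_active = false;
}

/* Error handler: quiesce the device WITHOUT clearing drvdata. */
static void model_err_detected(struct model_pci_dev *pdev)
{
    struct model_priv *priv = pdev->drvdata;

    assert(priv != NULL);   /* invariant: valid from probe until remove */
    if (priv->hw_active)
        model_unload(priv);
}

/* remove: the only entry point that frees and clears drvdata. */
static void model_remove(struct model_pci_dev *pdev)
{
    struct model_priv *priv = pdev->drvdata;

    assert(priv != NULL);   /* an invariant check, not a recovery branch */
    if (priv->hw_active)
        model_unload(priv);
    free(priv);
    pdev->drvdata = NULL;
}
```

In this sketch, calling model_probe(), then model_err_detected(), then model_remove() works without any NULL branch, because the error handler records its work in the state flag instead of clearing the driver data. A later suspend/resume or sriov_configure entry point could rely on the same invariant.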

-- 
Richard Yang
Help you, Help me

