lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <FC41C24E35F18A40888AACA1A36F3E418ADF339C@fmsmsx115.amr.corp.intel.com>
Date:	Fri, 27 Feb 2015 19:42:29 +0000
From:	"Nelson, Shannon" <shannon.nelson@...el.com>
To:	nick <xerofoify@...il.com>, Stefan Assmann <sassmann@...nic.de>,
	netdev <netdev@...r.kernel.org>
CC:	"e1000-devel@...ts.sourceforge.net" 
	<e1000-devel@...ts.sourceforge.net>,
	"Brandeburg, Jesse" <jesse.brandeburg@...el.com>
Subject: RE: [E1000-devel] i40e: crash on NMI by continuous module reload

> From: nick [mailto:xerofoify@...il.com]
> On 2015-02-27 09:16 AM, Stefan Assmann wrote:
> > On 27.02.2015 15:02, nick wrote:
> >
> > [...]
> >
> >>>     i40e: Fix a bug where Rx would stop after some time
> >>> [...]
> >>> diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c
> b/drivers/net/ethernet/intel/i40e/i40e_main.c
> >>> index f7464e8..ff6d94d 100644
> >>> --- a/drivers/net/ethernet/intel/i40e/i40e_main.c
> >>> +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
> >>> [...]
> >>> @@ -9169,6 +9178,13 @@ static int i40e_probe(struct pci_dev *pdev,
> const struct pci_device_id *ent)
> >>>  	if (err)
> >>>  		dev_info(&pf->pdev->dev, "set phy mask fail, aq_err %d\n",
> err);
> >>>
> >>> +	msleep(75);
> >>> +	err = i40e_aq_set_link_restart_an(&pf->hw, true, NULL);
> >>> +	if (err) {
> >>> +		dev_info(&pf->pdev->dev, "link restart failed, aq_err=%d\n",
> >>> +			 pf->hw.aq.asq_last_status);
> >>> +	}
> >>> +
> >>>  	/* The main driver is (mostly) up and happy. We need to set this
> state
> >>>  	 * before setting up the misc vector or we get a race and the
> vector
> >>>  	 * ends up disabled forever.
> >>>
> >>> With this hunk removed the driver successfully unloaded/reloaded a
> >>> couple of hundred times. Would it be safe to just remove this hunk?
> >>> I haven't seen any negative effects by removing this yet.
> >>>
> >>>   Stefan
> >>>
> >> Stefan,
> >> I wouldn't remove them yet as this does look like a valid idea to
> check to see if the link is
> >> restarting successfully. On the other hand can you try removing the
> msleep line as this one is
> >> most likely causing the issue due to sleeping for some long in a
> probe function is generally a
> >> bad idea.
> >> Thanks,
> >> Nick
> >
> > Thanks Nick for the quick reply. I tested removing the msleep but that
> > didn't make a difference. You actually need to remove the complete
> hunk
> > to get a stable driver reload.
> >
> >   Stefan
> >
> Stefan,
> Basically there are a few things that could be going wrong
> 1. You are getting a error return for the
> function,i40e_aq_set_link_restart_an
> 2. You are trying to re able the device again when not needed
> 3. You are sending a NULL value to a field for command arguments that
> takes a 0 and not NULL
> to take no arguments
> Nick

First of all, I would make sure you've got a short sleep in between each load and unload in this stress test.  There's a lot going on under the covers in the Firmware that really should be allowed to settle out before jostling it again with another load/unload command.  

It would help to know what Firmware you have on your NIC - can you give us the output from "ethtool -i <ethX>"?

The out-of-tree driver has just (finally!) been updated on SourceForge, so you might give this version 1.2.37 driver a try to see if it changes your result.  That code still has the hunk in question, but protected by a FW version check.  The related patch will be headed upstream to net-next very soon.

Firmware updates have also just been released, but I'm not sure they've made it to the Intel Downloads site yet.  Updating your FW will make a difference.

sln



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ