Message-Id: <1590F833-7D40-42FE-8FA2-6DCCADF9C6B0@bootc.net>
Date:	Fri, 6 Apr 2012 11:17:23 +0100
From:	Chris Boot <bootc@...tc.net>
To:	Nix <nix@...eri.org.uk>
Cc:	"Wyborny, Carolyn" <carolyn.wyborny@...el.com>,
	e1000-devel@...ts.sourceforge.net, netdev <netdev@...r.kernel.org>,
	lkml <linux-kernel@...r.kernel.org>,
	Bjorn Helgaas <bhelgaas@...gle.com>, linux-pci@...r.kernel.org
Subject: Re: [E1000-devel] e1000e interface hang on 82574L

On 19 Mar 2012, at 17:31, Nix wrote:

> On 19 Mar 2012, Carolyn Wyborny said:
> 
>>> you'll see that I tested that, and it doesn't work :( even if it did
>>> work, it shouldn't be needed: the driver attempts to turn off PCIe ASPM
>>> on affected NICs, and fails, apparently because *something* turns it
>>> back on again.
>>> 
>> The driver attempts to disable L0s state, not the entire feature. It
> 
> It tries to disable L1 state as well (or it did when I tested this last,
> although I suspect you're right and it may leave L1 turned on these
> days: judging by the contents of e1000_82574_info, anyway.)
> 
>> is also required that the device upstream on the bus from the 82574L
>> have this disabled. Yes, I agree there appears to be something in the
>> OS that either re-enables or fails to disable the feature on the
>> upstream device, as desired. Platforms/systems also appear to vary in
>> this regard, so the solutions may vary a bit as well.
>> 
>> It's worth trying your solution as well if what I suggested doesn't
>> work, but there is not one solution that fits all, unfortunately.
> 
> I don't *have* a solution. :( 'setpci by hand some unknown amount of
> time after booting once the interface has stabilized' hardly counts as a
> solution of any sort. It's, at best, a workaround that lets me use my
> systems without hourly lockups until a real solution is found.
> 
> (To clarify: manual setpci to force off the ASPM bits is the only thing
> that works for me. The driver's automatic disabling of L0s and L1
> doesn't work: nor does booting with pcie_aspm=off. In both cases, I end
> up with both L0s and L1 turned on, and a lockup some time later, unless
> I setpci the bits off by hand.)


Well, with that setpci incantation run against both the NIC and its upstream device to disable ASPM L1 (setpci -s <dev> CAP_EXP+10.b=40), everything has been working very well indeed. Is there something the e1000e driver could do to disable L1 as well as L0s, given that we know both states cause problems on these devices?
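
For anyone wondering what that setpci value actually does: CAP_EXP+10 is the low byte of the PCIe Link Control register, where bits 1:0 are the ASPM control field (bit 0 = L0s, bit 1 = L1) and bit 6 is Common Clock Configuration. The Python below is only an illustrative sketch of that bit layout; the helper names are made up for this example and it does not touch any hardware. Note that writing 0x40 overwrites the whole byte, so a read-modify-write that clears only bits 1:0 would probably be the more careful approach.

```python
# Low byte of the PCIe Link Control register (offset 0x10 in the PCI Express
# capability). Helper names are hypothetical, for illustration only.
ASPM_L0S = 0x01       # bit 0: L0s entry enabled
ASPM_L1 = 0x02        # bit 1: L1 entry enabled
ASPM_MASK = 0x03      # bits 1:0: ASPM control field
COMMON_CLOCK = 0x40   # bit 6: Common Clock Configuration


def decode_link_control(low_byte):
    """Report which of the bits discussed in this thread are set."""
    return {
        "l0s_enabled": bool(low_byte & ASPM_L0S),
        "l1_enabled": bool(low_byte & ASPM_L1),
        "common_clock": bool(low_byte & COMMON_CLOCK),
    }


def clear_aspm(low_byte):
    """Clear only the ASPM control field, leaving the other bits alone."""
    return low_byte & ~ASPM_MASK & 0xFF


# The value written by the setpci command: ASPM off, common clock set.
print(decode_link_control(0x40))

# A register with L0s and L1 both enabled, after clearing only the ASPM bits.
print(hex(clear_aspm(0x43)))
```

In other words, the 0x40 written by setpci leaves both L0s and L1 disabled, which would be consistent with the hangs going away.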

Adding Bjorn Helgaas and linux-pci to CCs to try to get the ball rolling some more, as this is crippling without the fixes.

Cheers,
Chris

-- 
Chris Boot
bootc@...tc.net

