Date:	Fri, 6 Apr 2012 10:41:44 -0300
From:	Henrique de Moraes Holschuh <hmh@....eng.br>
To:	Bjorn Helgaas <bhelgaas@...gle.com>
Cc:	Chris Boot <bootc@...tc.net>, Nix <nix@...eri.org.uk>,
	"Wyborny, Carolyn" <carolyn.wyborny@...el.com>,
	e1000-devel@...ts.sourceforge.net, netdev <netdev@...r.kernel.org>,
	lkml <linux-kernel@...r.kernel.org>, linux-pci@...r.kernel.org,
	Matthew Garrett <mjg@...hat.com>
Subject: Re: [E1000-devel] e1000e interface hang on 82574L

On Fri, 06 Apr 2012, Bjorn Helgaas wrote:
> On Fri, Apr 6, 2012 at 4:17 AM, Chris Boot <bootc@...tc.net> wrote:
> > On 19 Mar 2012, at 17:31, Nix wrote:
> >
> >> On 19 Mar 2012, Carolyn Wyborny said:
> >>
> >>>> you'll see that I tested that, and it doesn't work :( even if it
> >>>> did work, it shouldn't be needed: the driver attempts to turn off
> >>>> PCIe ASPM on affected NICs, and fails, apparently because
> >>>> *something* turns it back on again.
> >>>>
> >>> The driver attempts to disable L0s state, not the entire feature.
> >>> It
> >>
> >> It tries to disable L1 state as well (or it did when I tested this
> >> last, although I suspect you're right and it may leave L1 turned on
> >> these days: judging by the contents of e1000_82574_info, anyway.)
> >>
> >>> is also required that the device upstream on the bus from the
> >>> 82574L have this disabled. Yes, I agree there appears to be
> >>> something in the OS that either re-enables or fails to disable
> >>> the feature on the upstream device, as desired. Platforms/systems
> >>> also appear to vary in this regard, so the solutions may vary a
> >>> bit as well.
> >>>
> >>> It's worth trying your solution as well if what I suggested doesn't
> >>> work, but there is not one solution that fits all, unfortunately.
> >>
> >> I don't *have* a solution. :( 'setpci by hand some unknown amount
> >> of time after booting once the interface has stabilized' hardly
> >> counts as a solution of any sort. It's, at best, a workaround that
> >> lets me use my systems without hourly lockups until a real solution
> >> is found.
> >>
> >> (To clarify: manual setpci to force off the ASPM bits is the only
> >> thing that works for me. The driver's automatic disabling of L0s
> >> and L1 doesn't work: nor does booting with pcie_aspm=off. In both
> >> cases, I end up with both L0s and L1 turned on, and a lockup some
> >> time later, unless I setpci the bits off by hand.)
> >
> >
> > Well, with that setpci incantation run against the NIC and its
> > upstream device to disable ASPM L1s (setpci -s <dev>
> > CAP_EXP+10.b=40), everything has been working very well indeed. Is
> > there something the e1000e driver could do to disable L1s as well as
> > L0s if we know there's a problem with them for these devices?
> >
> > Adding Bjorn Helgaas and linux-pci to CCs to try to get the ball
> > rolling some more, as this is crippling without the fixes.
> 
> [+cc Matthew Garrett for ASPM stuff]
> 
> If I understand correctly, e1000e attempts to disable ASPM to work
> around an 82574L hardware erratum, but the PCI core either doesn't
> disable ASPM or it gets re-enabled somehow.

You probably need to disable ASPM on the device upstream of the 82574L
as well.  Here (SuperMicro C7X58) I got it stable by telling the BIOS
to disable L0s and L1 system-wide.

But not all BIOSes will have that option...
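
For reference, the byte written by the setpci command quoted above
(CAP_EXP+10.b=40) is the low byte of the PCIe Link Control register:
bits 1:0 select ASPM (00 = disabled, 01 = L0s, 10 = L1, 11 = both) and
bit 6 is Common Clock Configuration. A minimal Python sketch of that
decoding follows. The function and constant names are hypothetical, not
taken from e1000e or pciutils, and the code only does arithmetic on a
value that has already been read (for example with
setpci -s <dev> CAP_EXP+10.b); it does not touch the hardware.

```python
# Sketch only: decode the low byte of the PCIe Link Control register.
# All names below are illustrative assumptions, not a real API.

ASPM_MASK = 0x03          # bits 1:0, ASPM Control
COMMON_CLOCK = 0x40       # bit 6, Common Clock Configuration

ASPM_STATES = {
    0b00: "disabled",
    0b01: "L0s",
    0b10: "L1",
    0b11: "L0s+L1",
}


def decode_link_control(byte_value):
    """Return the ASPM state and common-clock flag for a Link Control low byte."""
    return {
        "aspm": ASPM_STATES[byte_value & ASPM_MASK],
        "common_clock": bool(byte_value & COMMON_CLOCK),
    }


def clear_aspm(byte_value):
    """Return the byte with only the two ASPM bits cleared, other bits kept."""
    return byte_value & ~ASPM_MASK & 0xFF


if __name__ == "__main__":
    # 0x40 is the value from the setpci command quoted in this thread.
    print(decode_link_control(0x40))
    # Example: a byte with L0s+L1 enabled and common clock set.
    print(hex(clear_aspm(0x43)))
```

Writing a literal 40 overwrites the whole low byte, so clearing just the
two ASPM bits from the value actually read (as clear_aspm does) is the
more conservative choice if other bits happen to be set on a platform.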

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh
