Message-ID: <VI1PR04MB48457E23AA3354316C7782D69AD10@VI1PR04MB4845.eurprd04.prod.outlook.com>
Date:   Wed, 28 Nov 2018 04:31:56 +0000
From:   Bharat Bhushan <bharat.bhushan@....com>
To:     Alex Williamson <alex.williamson@...hat.com>,
        Bjorn Helgaas <helgaas@...nel.org>
CC:     "linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "bharatb.yadav@...il.com" <bharatb.yadav@...il.com>,
        David Daney <david.daney@...ium.com>,
        Jan Glauber <jglauber@...ium.com>,
        Maik Broemme <mbroemme@...mpq.org>,
        Chris Blake <chrisrblake93@...il.com>
Subject: RE: [PATCH] PCI: Mark NXP LS1088 to avoid bus reset bus

Hi,

> -----Original Message-----
> From: Alex Williamson <alex.williamson@...hat.com>
> Sent: Tuesday, November 27, 2018 9:39 PM
> To: Bjorn Helgaas <helgaas@...nel.org>
> Cc: Bharat Bhushan <bharat.bhushan@....com>; linux-pci@...r.kernel.org;
> linux-kernel@...r.kernel.org; bharatb.yadav@...il.com; David Daney
> <david.daney@...ium.com>; Jan Glauber <jglauber@...ium.com>; Maik
> Broemme <mbroemme@...mpq.org>; Chris Blake
> <chrisrblake93@...il.com>
> Subject: Re: [PATCH] PCI: Mark NXP LS1088 to avoid bus reset bus
> 
> On Tue, 27 Nov 2018 09:33:56 -0600
> Bjorn Helgaas <helgaas@...nel.org> wrote:
> 
> > [+cc David, Jan, Alex, Maik, Chris]
> >
> > On Tue, Nov 27, 2018 at 08:46:33AM +0000, Bharat Bhushan wrote:
> > > NXP (Freescale Vendor ID) LS1088 chips do not behave correctly after
> > > bus reset with e1000e. Link state of device does not comes UP and so
> > > config space never accessible again.
> >
> > Previous similar commits:
> >
> >   822155100e58 ("PCI: Mark Cavium CN8xxx to avoid bus reset")
> >   8e2e03179923 ("PCI: Mark Atheros AR9580 to avoid bus reset")
> >   9ac0108c2bac ("PCI: Mark Atheros AR9485 and QCA9882 to avoid bus reset")
> >   c3e59ee4e766 ("PCI: Mark Atheros AR93xx to avoid bus reset")
> >
> > 1) Please make your subject match (remove the spurious "bus" at the
> > end)

Will correct; it was added by mistake.

> >
> > 2) This should probably be marked for stable (v3.14 and later, since
> > the quirk itself appeared in v3.19 and marked for v3.14 and later
> > stable kernels).  Maybe even mark it as "Fixes: c3e59ee4e766..." to
> > connect it.

Ok,

> >
> > 3) The 1957:80c0 PCI ID doesn't appear in
> > https://pci-ids.ucw.cz/; can you add it?
> >

Yes, I will add it.

> > 4) Is there a hardware erratum for this?  If so, please include the
> > URL here.

No h/w errata as of now.

> >
> > 5) Can you reproduce the problem using the same endpoint (e1000e) on a
> > different system with a different bridge?

I have multiple LS1088 boards and I can observe the problem on all of them.
When I use the same PCI device on another NXP board (LS2088), it works fine.

> >
> > 6) Have you looked at this with a PCIe analyzer?  It would be very
> > interesting to compare the boot-time or system reboot path with the
> > individual bus reset path you're fixing.

I have not used a PCIe analyzer.

> >
> > Since there are several similar reports and they sometimes involve the
> > same devices (both your patch and 822155100e58 mention e1000e), I'm a
> > little suspicious that we're doing something wrong in the bus reset
> > path.
> 
> I agree, entirely excluding bus resets is not something to be taken lightly.  It's
> less than ideal for an endpoint and a fairly major functional gap for a
> downstream port.  It should really be considered a last resort.
> 
> > I think bus reset uses Secondary Bus Reset in the Bridge Control
> > register.  That's a generic mechanism that I would expect to be pretty
> > well-tested.  I suspect the BIOS probably uses it in the reboot path,
> > and the device probably works after that.
> >
> > So I wonder if the Linux delay isn't quite long enough, or our first
> > access to the device isn't quite right, e.g., maybe there's some issue
> > with the bus/device number capture (PCIe r4.0, sec 2.2.6.2).
> 
> Tweaking the delay would be a reasonable solution, though we are seeing
> some issues where users with lots of assigned devices that require bus
> resets experience long delays as vfio file descriptors are closed sequentially
> on exit.

I tried increasing the delay after reset in pci_reset_secondary_bus(), but it did not help.
Do I need to add a delay somewhere else as well?
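
For reference, the reset sequence under discussion can be sketched in userspace C. This is only a mock, not the kernel implementation: struct mock_bridge, the mock_* helpers, and the post_reset_ms parameter are hypothetical names made up for illustration. Only the PCI_BRIDGE_CONTROL offset (0x3e) and the PCI_BRIDGE_CTL_BUS_RESET bit (0x40) are taken from the kernel's pci_regs.h; the 2 ms hold time is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register offset and bit as defined in include/uapi/linux/pci_regs.h. */
#define PCI_BRIDGE_CONTROL        0x3e
#define PCI_BRIDGE_CTL_BUS_RESET  0x40

/* Hypothetical stand-in for a bridge's config space; not a kernel type. */
struct mock_bridge {
	uint8_t  cfg[256];
	unsigned resets_seen;	/* 0 -> 1 transitions of the reset bit */
	unsigned waited_ms;	/* total simulated delay */
};

static uint16_t mock_read16(const struct mock_bridge *b, unsigned off)
{
	return (uint16_t)(b->cfg[off] | (b->cfg[off + 1] << 8));
}

static void mock_write16(struct mock_bridge *b, unsigned off, uint16_t val)
{
	uint16_t old = mock_read16(b, off);

	/* Count each assertion of Secondary Bus Reset. */
	if (off == PCI_BRIDGE_CONTROL &&
	    !(old & PCI_BRIDGE_CTL_BUS_RESET) &&
	    (val & PCI_BRIDGE_CTL_BUS_RESET))
		b->resets_seen++;

	b->cfg[off] = (uint8_t)(val & 0xff);
	b->cfg[off + 1] = (uint8_t)(val >> 8);
}

/* Simulated sleep: records the delay instead of blocking. */
static void mock_sleep_ms(struct mock_bridge *b, unsigned ms)
{
	b->waited_ms += ms;
}

/*
 * Assert Secondary Bus Reset, hold it, deassert it, then wait
 * post_reset_ms before the first config access downstream.
 */
static void mock_secondary_bus_reset(struct mock_bridge *b,
				     unsigned post_reset_ms)
{
	uint16_t ctrl;

	assert(b != NULL);

	ctrl = mock_read16(b, PCI_BRIDGE_CONTROL);
	ctrl = (uint16_t)(ctrl | PCI_BRIDGE_CTL_BUS_RESET);
	mock_write16(b, PCI_BRIDGE_CONTROL, ctrl);
	mock_sleep_ms(b, 2);	/* assumed hold time while reset is asserted */

	ctrl = (uint16_t)(ctrl & ~PCI_BRIDGE_CTL_BUS_RESET);
	mock_write16(b, PCI_BRIDGE_CONTROL, ctrl);
	mock_sleep_ms(b, post_reset_ms);	/* the delay I tried increasing */
}
```

The post-reset wait is a parameter here only to make the experiment explicit: other Bridge Control bits are preserved, and the only thing that varies is how long we wait after deasserting reset.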

Thanks
-Bharat

>  So perhaps we could flag downstream ports requiring an extra delay,
> if that becomes a solution.  Your mention of the bus/device number also
> reminds me of the issue we saw on Threadripper where there were patches
> proposed to re-write the secondary and subordinate bus numbers after
> reset.  AMD was able to resolve that in a firmware update, but there could
> be something similar occurring here. Thanks,
> 
> Alex
> 
> > > Signed-off-by: Bharat Bhushan <Bharat.Bhushan@....com>
> > > ---
> > >  drivers/pci/quirks.c | 7 +++++++
> > >  1 file changed, 7 insertions(+)
> > >
> > > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > > > index 4700d24e5d55..b9ae4e9f101a 100644
> > > > --- a/drivers/pci/quirks.c
> > > > +++ b/drivers/pci/quirks.c
> > > > @@ -3391,6 +3391,13 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0033, quirk_no_bus_reset);
> > > >   */
> > > >  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_CAVIUM, 0xa100, quirk_no_bus_reset);
> > > >
> > > > +/*
> > > > + * NXP (Freescale Vendor ID) LS1088 chips do not behave correctly after
> > > > + * bus reset. Link state of device does not comes UP and so config space
> > > > + * never accessible again.
> > > > + */
> > > > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_FREESCALE, 0x80c0, quirk_no_bus_reset);
> > > > +
> > > >  static void quirk_no_pm_reset(struct pci_dev *dev)
> > > >  {
> > >  	/*
> > > --
> > > 2.19.1
> > >
