Message-ID: <5d4b3a716f85017c17c52a85915fba9e19509e81.camel@kernel.crashing.org>
Date: Thu, 16 Jul 2020 08:49:21 +1000
From: Benjamin Herrenschmidt <benh@...nel.crashing.org>
To: Bjorn Helgaas <helgaas@...nel.org>,
David Laight <David.Laight@...LAB.COM>
Cc: "'Oliver O'Halloran'" <oohall@...il.com>,
Arnd Bergmann <arnd@...db.de>, Keith Busch <kbusch@...nel.org>,
Paul Mackerras <paulus@...ba.org>,
sparclinux <sparclinux@...r.kernel.org>,
Toan Le <toan@...amperecomputing.com>,
Greg Ungerer <gerg@...ux-m68k.org>,
Marek Vasut <marek.vasut+renesas@...il.com>,
Rob Herring <robh@...nel.org>,
Lorenzo Pieralisi <lorenzo.pieralisi@....com>,
Sagi Grimberg <sagi@...mberg.me>,
Russell King <linux@...linux.org.uk>,
Ley Foon Tan <ley.foon.tan@...el.com>,
Christoph Hellwig <hch@....de>,
Geert Uytterhoeven <geert@...ux-m68k.org>,
Kevin Hilman <khilman@...libre.com>,
linux-pci <linux-pci@...r.kernel.org>,
Jakub Kicinski <kuba@...nel.org>,
Matt Turner <mattst88@...il.com>,
"linux-kernel-mentees@...ts.linuxfoundation.org"
<linux-kernel-mentees@...ts.linuxfoundation.org>,
Guenter Roeck <linux@...ck-us.net>,
Ray Jui <rjui@...adcom.com>, Jens Axboe <axboe@...com>,
Ivan Kokshaysky <ink@...assic.park.msu.ru>,
Shuah Khan <skhan@...uxfoundation.org>,
"bjorn@...gaas.com" <bjorn@...gaas.com>,
Boris Ostrovsky <boris.ostrovsky@...cle.com>,
Richard Henderson <rth@...ddle.net>,
Juergen Gross <jgross@...e.com>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
Scott Branden <sbranden@...adcom.com>,
Jingoo Han <jingoohan1@...il.com>,
"Saheed O. Bolarinwa" <refactormyself@...il.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Philipp Zabel <p.zabel@...gutronix.de>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Gustavo Pimentel <gustavo.pimentel@...opsys.com>,
linuxppc-dev <linuxppc-dev@...ts.ozlabs.org>,
"David S. Miller" <davem@...emloft.net>,
Heiner Kallweit <hkallweit1@...il.com>
Subject: Re: [RFC PATCH 00/35] Move all PCIBIOS* definitions into arch/x86
On Wed, 2020-07-15 at 17:12 -0500, Bjorn Helgaas wrote:
> > I've 'played' with PCIe error handling - without much success.
> > What might be useful is for a driver that has just read ~0u to
> > be able to ask 'has there been an error signalled for this device?'.
>
> In many cases a driver will know that ~0 is not a valid value for the
> register it's reading. But if ~0 *could* be valid, an interface like
> you suggest could be useful. I don't think we have anything like that
> today, but maybe we could. It would certainly be nice if the PCI core
> noticed, logged, and cleared errors. We have some of that for AER,
> but that's an optional feature, and support for the error bits in the
> garden-variety PCI_STATUS register is pretty haphazard. As you note
> below, this sort of SERR/PERR reporting is frequently hard-wired in
> ways that take it out of our purview.
We do have pci_channel_state (via pci_channel_offline()), which covers
the cases where the underlying error handling (such as EEH or unplug)
results in the device being offlined, though this tends to be
asynchronous, so it might take a few ~0's before you get it.

It's typically used to break potentially infinite loops in some
drivers.
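
To illustrate that loop-breaking pattern, here is a minimal userspace
sketch, not code from any actual driver. The names mock_dev,
mock_readl(), mock_channel_offline(), STATUS_BUSY and wait_not_busy()
are hypothetical stand-ins; in a real driver they would roughly
correspond to struct pci_dev, readl() and pci_channel_offline(). The
mock assumes the device is already dead (every read returns ~0) and
that the offline state only becomes visible after a few reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pci_dev (assumption, not kernel code). */
struct mock_dev {
	int reads_until_offline;	/* ~0 reads before the "core" offlines it */
	bool offline;			/* stand-in for the channel state */
};

/* Stand-in for readl(): a dead device always returns all ones.
 * The offline flag lags behind, mimicking asynchronous error handling. */
static uint32_t mock_readl(struct mock_dev *dev)
{
	if (dev->reads_until_offline > 0 && --dev->reads_until_offline == 0)
		dev->offline = true;
	return ~0u;
}

/* Stand-in for pci_channel_offline(). */
static bool mock_channel_offline(const struct mock_dev *dev)
{
	return dev->offline;
}

#define STATUS_BUSY 0x1u	/* hypothetical "busy" bit in a status register */

/* Poll until the busy bit clears. Returns 0 when the device is ready,
 * -1 if the device went away. *polls counts the reads performed.
 * Since ~0 has every bit set, the busy bit never clears on a dead
 * device; the offline check is what terminates the loop. */
static int wait_not_busy(struct mock_dev *dev, int *polls)
{
	assert(dev && polls);
	*polls = 0;

	for (;;) {
		uint32_t status = mock_readl(dev);

		(*polls)++;
		if (status == ~0u && mock_channel_offline(dev))
			return -1;	/* device offlined, stop spinning */
		if (!(status & STATUS_BUSY))
			return 0;
	}
}
```

With reads_until_offline set to 3, wait_not_busy() returns -1 on the
third poll: the first two ~0 reads are seen before the offline state
is reported, which is the "few ~0's" delay described above.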
There is no interface to check whether *an* error happened, though in
most cases it will be captured in the status register, which is
harvested (and cleared?) by some EDAC drivers, iirc...

All this lacks coordination, I agree.
Cheers,
Ben.