[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <0c9326c943c0e6aa572cc132ee2deb952bf41c7f.camel@linux.ibm.com>
Date: Tue, 07 Sep 2021 14:21:17 +0200
From: Niklas Schnelle <schnelle@...ux.ibm.com>
To: "Oliver O'Halloran" <oohall@...il.com>
Cc: Bjorn Helgaas <bhelgaas@...gle.com>,
Linas Vepstas <linasvepstas@...il.com>,
Russell Currey <ruscur@...sell.cc>,
linuxppc-dev <linuxppc-dev@...ts.ozlabs.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-s390@...r.kernel.org,
Matthew Rosato <mjrosato@...ux.ibm.com>,
Pierre Morel <pmorel@...ux.ibm.com>
Subject: Re: [PATCH 0/5] s390/pci: automatic error recovery
On Tue, 2021-09-07 at 10:45 +0200, Niklas Schnelle wrote:
> On Tue, 2021-09-07 at 12:04 +1000, Oliver O'Halloran wrote:
> > On Mon, Sep 6, 2021 at 7:49 PM Niklas Schnelle <schnelle@...ux.ibm.com> wrote:
> > > Patch 3 I already sent separately resulting in the discussion below but without
> > > a final conclusion.
> > >
> > > https://lore.kernel.org/lkml/20210720150145.640727-1-schnelle@linux.ibm.com/
> > >
> > > I believe even though there were some doubts about the use of
> > > pci_dev_is_added() by arch code the existing uses as well as the use in the
> > > final patch of this series warrant this export.
> >
> > The use of pci_dev_is_added() in arch/powerpc was because in the past
> > pci_bus_add_device() could be called before pci_device_add(). That was
> > fixed a while ago so It should be safe to remove those calls now.
>
> Hmm, ok that confirms Bjorns suspicion and explains how it came to be.
> I can certainly sent a patch for that. This would then leave only the
> existing use in s390 which I added because of a dead lock prevention
> and explained here:
> https://lore.kernel.org/lkml/87d15d5eead35c9eaa667958d057cf4a81a8bf13.camel@linux.ibm.com/
>
> Plus the need to use it in the recovery code of this series. I think in
> the EEH code the need for a similar check is alleviated by the checks
> in the beginning of
> arch/powerpc/kernel/eeh_driver.c:eeh_handle_normal_event() especially
> eeh_slot_presence_check() which checks presence via the hotplug slot.
> I guess we could use our own state tracking in a similar way but felt
> like pci_dev_is_added() is the more logical choice.
Looking into this again, I think we actually can't easily track this
state ourselves outside struct pci_dev. The reason for this is that
when e.g. arch/s390/pci/pci_sysfs.c:recover_store() removes the struct
pci_dev and scans it again the new struct pci_dev re-uses the same
struct zpci_dev because from a platform point of view the PCI device
was never removed but only disabled and re-enabled. Thus we can only
distinguish the stale struct pci_dev by looking at things stored in
struct pci_dev itself.
That said, I think for the recovery case we might be able to drop the
pci_dev_is_added() and rely on pdev->driver != NULL which we check
anyway and that should catch any PCI device that was already removed.
Powered by blists - more mailing lists