[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190806133638.GQ151852@google.com>
Date: Tue, 6 Aug 2019 08:36:38 -0500
From: Bjorn Helgaas <helgaas@...nel.org>
To: "Rafael J. Wysocki" <rafael@...nel.org>
Cc: Linux PCI <linux-pci@...r.kernel.org>,
Lorenzo Pieralisi <lorenzo.pieralisi@....com>,
Keith Busch <keith.busch@...el.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
"Rafael J . Wysocki" <rjw@...ysocki.net>,
Mika Westerberg <mika.westerberg@...ux.intel.com>,
Linux PM <linux-pm@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 3/5] PCI / PM: Check for error when reading PME status
On Mon, Aug 05, 2019 at 11:02:51PM +0200, Rafael J. Wysocki wrote:
> On Mon, Aug 5, 2019 at 10:52 PM Bjorn Helgaas <helgaas@...nel.org> wrote:
> >
> > pci_check_pme_status() reads the Power Management capability to determine
> > whether a device has generated a PME. The capability is in config space,
> > which is accessible in D0, D1, D2, and D3hot, but not in D3cold.
> >
> > If we call pci_check_pme_status() on a device that's in D3cold, config
> > reads fail and return ~0 data, which we erroneously interpreted as "the
> > device has generated a PME".
> >
> > 000dd5316e1c ("PCI: Do not poll for PME if the device is in D3cold")
> > avoided many of these problems by avoiding pci_check_pme_status() if we
> > think the device is in D3cold. However, it is not a complete fix because
> > the device may go to D3cold after we check its power state but before
> > pci_check_pme_status() reads the Power Management Status Register.
> >
> > Return false ("device has not generated a PME") if we get an error response
> > reading the Power Management Status Register.
> >
> > Fixes: 000dd5316e1c ("PCI: Do not poll for PME if the device is in D3cold")
> > Fixes: 71a83bd727cc ("PCI/PM: add runtime PM support to PCIe port")
> > Signed-off-by: Bjorn Helgaas <bhelgaas@...gle.com>
> > ---
> > drivers/pci/pci.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index 984171d40858..af6a97d7012b 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -2008,6 +2008,9 @@ bool pci_check_pme_status(struct pci_dev *dev)
> >
> > pmcsr_pos = dev->pm_cap + PCI_PM_CTRL;
> > pci_read_config_word(dev, pmcsr_pos, &pmcsr);
> > + if (pmcsr == (u16) PCI_ERROR_RESPONSE)
> > + return false;
>
> No, sorry.
>
> We tried that and it didn't work.
>
> pcie_pme_handle_request() depends on this returning "true" for all
> bits set, as from its perspective "device is not accessible" may very
> well mean "device may have signaled PME".
Right, it's obviously wrong in the case of devices that advertise
D3cold in PME_Support, i.e., devices that can generate PME even with
main power off. Also, we may want to try to wake up devices if the
config read fails for a reason other than the device being in D3cold.
What I don't like about the current code is that it checks
PCI_PM_CTRL_PME_STATUS in data that may be completely bogus. Do you
think it would be better to do something like this:
pci_read_config_word(dev, pmcsr_pos, &pmcsr);
if (pmcsr == (u16) PCI_ERROR_RESPONSE) {
if (pci_pme_capable(dev, PCI_PM_CAP_PME_D3cold))
return true;
return false;
}
or maybe this:
pci_read_config_word(dev, pmcsr_pos, &pmcsr);
if (pmcsr == (u16) PCI_ERROR_RESPONSE)
return true;
We should get PCI_ERROR_RESPONSE pretty reliably from devices in
D3cold, so the first possibility would cover that case.
But since pci_check_pme_status() basically returns a hint ("true"
means a device *may* have generated a PME), and even if the hint is
wrong, the worst that happens is an unnecessary wakeup, maybe the
second possibility is safer.
What do you think?
> > if (!(pmcsr & PCI_PM_CTRL_PME_STATUS))
> > return false;
Powered by blists - more mailing lists