[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3960283.lbE9ESSj2m@aspire.rjw.lan>
Date: Sat, 15 Apr 2017 00:27:31 +0200
From: "Rafael J. Wysocki" <rjw@...ysocki.net>
To: Lukas Wunner <lukas@...ner.de>
Cc: Geert Uytterhoeven <geert@...ux-m68k.org>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Yinghai Lu <yinghai@...nel.org>,
Mika Westerberg <mika.westerberg@...ux.intel.com>,
Laurent Pinchart <laurent.pinchart@...asonboard.com>,
Simon Horman <horms@...ge.net.au>,
linux-pci <linux-pci@...r.kernel.org>,
Linux PM list <linux-pm@...r.kernel.org>,
Linux-Renesas <linux-renesas-soc@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: PCI / PM: Crashes in PME scan during system suspend
On Friday, April 14, 2017 10:22:49 AM Lukas Wunner wrote:
> On Tue, Feb 14, 2017 at 12:26:01PM +0100, Rafael J. Wysocki wrote:
> > On Tuesday, February 14, 2017 10:31:38 AM Geert Uytterhoeven wrote:
> > > Laurent Pinchart reported that r8a7790/Lager crashes during suspend tests.
> > >
> > > I managed to reproduce the issue on r8a7791/koelsch:
> > > - It only happens during suspend tests, after writing either "platform"
> > > or "processors" to /sys/power/pm_test,
> > > - It does not (or is less likely) to happen during full system suspend
> > > ("core" or "none").
> > >
> > > More investigation shows this happens when the PME scan runs, once per
> > > second. During PME scan, the PCI host bridge (rcar-pci) registers are
> > > accessed while the host bridge's module clock has already been disabled,
> > > leading to a crash.
> >
> > OK, so clearly PME scans should be suspended before the host bridge
> > registers become inaccessible.
> >
> > Another question, though, is whether or not PME scans are actually necessary
> > on the affected platforms at all.
>
> I'm not seeing a fix for this in linux-next, am I missing something?
> Has anyone looked into it or is the issue still open?
It is still open AFAICS.
> Below is a tentative patch which moves PME polling to a freezable
> workqueue, so it is frozen before the host bridge is suspended.
> Geert, Laurent, could you test this?
>
> The patch may be problematic in that pci_pme_list_scan() acquires
> pci_pme_list_mutex, which is also acquired by pci_pme_active(),
> which gets called when devices are suspended -- *after* the worker
> has been frozen. I'm not really familiar with the freezer, can it
> happen that the worker is frozen while holding the mutex? If so
> this would deadlock. Rafael?
That depends on the worker, precisely on where it calls try_to_freeze().
That said I think it won't do that while holding any locks. :-)
Thanks,
Rafael
Powered by blists - more mailing lists