Date:   Wed, 18 Jul 2018 10:25:05 +0200
From:   Lukas Wunner <lukas@...ner.de>
To:     "Rafael J. Wysocki" <rafael@...nel.org>
Cc:     Lyude Paul <lyude@...hat.com>, nouveau@...ts.freedesktop.org,
        David Airlie <airlied@...ux.ie>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        dri-devel <dri-devel@...ts.freedesktop.org>,
        Ben Skeggs <bskeggs@...hat.com>,
        Linux PM <linux-pm@...r.kernel.org>
Subject: Re: [Nouveau] [PATCH 1/5] drm/nouveau: Prevent RPM callback
 recursion in suspend/resume paths

On Wed, Jul 18, 2018 at 09:38:41AM +0200, Rafael J. Wysocki wrote:
> On Tue, Jul 17, 2018 at 8:20 PM, Lukas Wunner <lukas@...ner.de> wrote:
> > Okay, the PCI device is suspending and the nvkm_i2c_aux_acquire()
> > wants it in resumed state, so is waiting forever for the device to
> > runtime suspend in order to resume it again immediately afterwards.
> >
> > The deadlock in the stack trace you've posted could be resolved using
> > the technique I used in d61a5c106351 by adding the following to
> > include/linux/pm_runtime.h:
> >
> > static inline bool pm_runtime_status_suspending(struct device *dev)
> > {
> >         return dev->power.runtime_status == RPM_SUSPENDING;
> > }
> >
> > static inline bool is_pm_work(struct device *dev)
> > {
> >         struct work_struct *work = current_work();
> >
> >         return work && work == &dev->power.work;
> > }
> >
> > Then adding this to nvkm_i2c_aux_acquire():
> >
> >         struct device *dev = pad->i2c->subdev.device->dev;
> >
> >         if (!(is_pm_work(dev) && pm_runtime_status_suspending(dev))) {
> >                 ret = pm_runtime_get_sync(dev);
> >                 if (ret < 0 && ret != -EACCES)
> >                         return ret;
> >         }
[snip]
> 
> For the record, I don't quite like this approach as it seems to be
> working around a broken dependency graph.
> 
> If you need to resume device A from within the runtime resume callback
> of device B, then clearly B depends on A and there should be a link
> between them.
> 
> That said, I do realize that it may be the path of least resistance,
> but then I wonder if we can do better than this.

The GPU contains an i2c subdevice for each connector with DDC lines.
I believe those are modelled as children of the GPU's PCI device as
they're accessed via the PCI device's MMIO space.

The problem here is that when the GPU's PCI device runtime suspends,
its i2c child device needs to be runtime active to suspend the MST
topology.  Catch-22.
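
To make the Catch-22 concrete, here is a toy userspace model, not kernel
code.  The names toy_rpm_status, toy_gpu and toy_resume_would_deadlock()
are made up for illustration, and the model assumes that a synchronous
resume request against a device that is mid-suspend waits for that
suspend to finish:

```c
#include <stdbool.h>

/* Simplified runtime PM states; a stand-in, not the kernel's enum. */
enum toy_rpm_status { TOY_ACTIVE, TOY_SUSPENDING, TOY_SUSPENDED };

struct toy_gpu {
	enum toy_rpm_status status;
	/* true while the calling thread is running the suspend callback */
	bool in_suspend_callback;
};

/*
 * A synchronous resume request can never complete if the device is
 * mid-suspend and the caller is the very thread that has to finish
 * that suspend: it would be waiting for itself.
 */
static bool toy_resume_would_deadlock(const struct toy_gpu *gpu)
{
	return gpu->status == TOY_SUSPENDING && gpu->in_suspend_callback;
}
```

The same request from any other thread would merely wait until the
suspend has completed, so only the self-referential case is a problem.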

I don't know whether or not it's necessary to suspend the MST topology.
I'm not an expert on DisplayPort Multi-Stream Transport.

BTW Lyude, in patches 4 and 5 of this series, you're runtime resuming
pad->i2c->subdev.device->dev.  Is this the PCI device or is it the i2c
device?  I'm always confused by nouveau's structs.  In nvkm_i2c_bus_ctor()
I can see that the device you're runtime resuming is the parent of the
i2c_adapter:

	struct nvkm_device *device = pad->i2c->subdev.device;
	[...]
	bus->i2c.dev.parent = device->dev;

If the i2c_adapter is a child of the PCI device, it's sufficient
to runtime resume the i2c_adapter, i.e. bus->i2c.dev, and this will
implicitly runtime resume its parent.
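
To illustrate the parent/child rule, here is a toy userspace model, not
kernel code.  struct toy_dev, toy_resume() and toy_get_sync() are made-up
names; the real entry point is pm_runtime_get_sync(), and the real core
also deals with locking, callbacks and the ignore_children flag, all of
which this sketch leaves out:

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct device; not the kernel's definition. */
struct toy_dev {
	struct toy_dev *parent;	/* NULL for a root device */
	bool active;		/* true once "runtime resumed" */
	int usage_count;	/* references held by callers */
	int active_children;	/* children that are currently active */
};

/* Bring dev to the active state, resuming its ancestors first. */
static void toy_resume(struct toy_dev *dev)
{
	if (dev->active)
		return;
	if (dev->parent) {
		toy_resume(dev->parent);
		/* An active child keeps its parent from suspending. */
		dev->parent->active_children++;
	}
	dev->active = true;
}

/* Hypothetical analogue of pm_runtime_get_sync(): take a usage
 * reference on dev only, then make sure it is active. */
static void toy_get_sync(struct toy_dev *dev)
{
	dev->usage_count++;
	toy_resume(dev);
}
```

With a toy_dev "pci" as the parent of a toy_dev "adapter", a single
toy_get_sync(&adapter) leaves both active, while only the adapter holds
a usage reference.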

Thanks,

Lukas
