lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4e45b65b2151293d91dfc5811f64f3453f591f9d.camel@redhat.com>
Date:   Wed, 24 Apr 2019 15:16:37 -0400
From:   Lyude Paul <lyude@...hat.com>
To:     Bjorn Helgaas <helgaas@...nel.org>
Cc:     David Ober <dober@...ovo.com>, linux-pci@...r.kernel.org,
        nouveau@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org,
        Karol Herbst <kherbst@...hat.com>,
        Ben Skeggs <skeggsb@...il.com>, stable@...r.kernel.org,
        linux-kernel@...r.kernel.org,
        "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Hans de Goede <hdegoede@...hat.com>
Subject: Re: [PATCH] pci/quirks: Add quirk to reset nvgpu at boot for the
 Lenovo ThinkPad P50

On Wed, 2019-04-24 at 13:59 -0500, Bjorn Helgaas wrote:
> Not being a scheduled work expert, I was unsure if this experiment was
> equivalent to what I proposed.
> 
> I'm always suspicious of singleton solutions like this (using
> schedule_work() in runtime_resume()) because usually they seem to be
> solving a generic problem that should happen on many kinds of
> hardware.  The 0b2fe6594fa2 ("drm/nouveau: Queue hpd_work on (runtime)
> resume") commit log says:
> 
>   We need to call drm_helper_hpd_irq_event() on resume to properly
>   detect monitor connection / disconnection on some laptops, use
>   hpd_work for this to avoid deadlocks.
> 
> The situation of a monitor being connected or disconnected during
> suspend can happen to *any* GPU, but the commit only changes nouveau,
> which of course raises the question of how we deal with that in other
> drivers.  If the Nvidia GPU has some unique behavior related to
> monitor connection, that would explain special-case code there, but
> the commit doesn't mention anything like that.
> 
> It should be simple to revert 0b2fe6594fa2 and see whether it changes
> the behavior at all (well, simple except for the fact that this
> problem isn't 100% reproducible in the first place).

It's not 100% reproducible, but it's at least 90% so it's not difficult for me
to test at all.

Also, reverting this commit makes no difference either. Note that while that
commit only changed nouveau, scheduled_work() is exactly how a number of other
drivers (i915 for instance) handle reprobing like this as well. The reason
being that we can't do full connector reprobing in our runtime resume thread
because we could deadlock if someone else is holding a modesetting lock we
need and waiting on us to resume at the same time (there's a number of other
bug fixes in nouveau for other issues caused by the same deadlock scenario). 

I'm confused here though, it sounds like you're running under the assumption
that PCI devices like this aren't reset into a clean state during a system
reboot, is that correct?

> 
> > Do we want to have this discussion on the bz btw, or is this email
> > thread fine?
> 
> Email is fine.
-- 
Cheers,
	Lyude Paul

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ