Date:   Mon, 21 Oct 2019 18:40:50 +0200
From:   Karol Herbst <kherbst@...hat.com>
To:     Mika Westerberg <mika.westerberg@...el.com>
Cc:     Bjorn Helgaas <helgaas@...nel.org>,
        "Rafael J . Wysocki" <rjw@...ysocki.net>,
        LKML <linux-kernel@...r.kernel.org>,
        Lyude Paul <lyude@...hat.com>,
        Linux PCI <linux-pci@...r.kernel.org>,
        Linux PM <linux-pm@...r.kernel.org>,
        dri-devel <dri-devel@...ts.freedesktop.org>,
        nouveau <nouveau@...ts.freedesktop.org>,
        Linux ACPI Mailing List <linux-acpi@...r.kernel.org>
Subject: Re: [PATCH v3] pci: prevent putting nvidia GPUs into lower device
 states on certain intel bridges

On Mon, Oct 21, 2019 at 5:46 PM Mika Westerberg
<mika.westerberg@...el.com> wrote:
>
> On Mon, Oct 21, 2019 at 04:49:09PM +0200, Karol Herbst wrote:
> > On Mon, Oct 21, 2019 at 4:09 PM Mika Westerberg
> > <mika.westerberg@...el.com> wrote:
> > >
> > > On Mon, Oct 21, 2019 at 03:54:09PM +0200, Karol Herbst wrote:
> > > > > I really would like to provide you more information about such
> > > > > workaround but I'm not aware of any ;-) I have not seen any issues like
> > > > > this when D3cold is properly implemented in the platform.  That's why
> > > > > > I'm a bit skeptical that this has anything to do with specific Intel PCIe
> > > > > ports. More likely it is some power sequence in the _ON/_OFF() methods
> > > > > that is run differently on Windows.
> > > >
> > > > yeah.. maybe. I really don't know what's the actual root cause. I just
> > > > know that with this workaround it works perfectly fine on my and some
> > > > other systems it was tested on. Do you know who would be best to
> > > > approach to get proper documentation about those methods and what are
> > > > the actual prerequisites of those methods?
> > >
> > > Those should be documented in the ACPI spec. Chapter 7 should explain
> > > power resources and the device power methods in detail.
> >
> > either I looked up the wrong spec or the documentation isn't really
> > saying much there.
>
> Well it explains those methods, _PSx, _PRx and _ON()/_OFF(). In case of
> PCIe device you also want to check PCIe spec. PCIe 5.0 section 5.8 "PCI
> Function Power State Transitions" has a picture about the supported
> power state transitions and there we can find that function must be in
> D3hot before it can be transitioned into D3cold so if the _OFF() for
> example blindly assumes that the device is in D0 when it is called, it
> is a bug in the BIOS.
>
> BTW, where can I find acpidump of such system?

I am sure it's uploaded somewhere already. But it's not an issue of
just one system. It's essentially hitting every single laptop with a
Skylake or Kaby Lake CPU + Nvidia GPU. I haven't seen any system where
it's actually working right now (and we have been pestering Nvidia about
this issue for over a year already with no solution).

I've attached an acpidump from my system.
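
For reference, the ordering rule quoted above (a function must be in D3hot
before it can go to D3cold) can be sketched roughly as below. This is only an
illustration, not kernel or firmware code: the enum and both function names
are made up for the example, and only the D3hot-before-D3cold rule from the
quoted text is encoded, not the full PCIe transition diagram.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; this is not the kernel's pci_power_t. */
enum dev_power_state { DEV_D0, DEV_D1, DEV_D2, DEV_D3HOT, DEV_D3COLD };

/* Encodes only the rule quoted above: D3cold may be entered from D3hot. */
static bool may_enter_d3cold(enum dev_power_state cur)
{
    return cur == DEV_D3HOT;
}

/*
 * Sketch of a power-off path that does not blindly assume D0:
 * step through D3hot first if needed, then remove power.
 */
static enum dev_power_state power_off(enum dev_power_state cur)
{
    if (cur == DEV_D3COLD)
        return cur;                 /* already powered off */
    if (!may_enter_d3cold(cur))
        cur = DEV_D3HOT;            /* stand-in for putting the function into D3hot */
    /* stand-in for turning off the power resource */
    return may_enter_d3cold(cur) ? DEV_D3COLD : cur;
}
```

An _OFF() that skipped the intermediate step would correspond to returning
DEV_D3COLD directly from any state, which is the kind of assumption described
above as a BIOS bug.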

Attachment: "xps_9560.tar.xz" of type "application/x-xz" (286880 bytes)
