linux-kernel - Re: pciehp is broken from 4.10-rc1

Message-ID: <20944266.mG6gYdZZah@aspire.rjw.lan>
Date:   Mon, 06 Feb 2017 12:49:07 +0100
From:   "Rafael J. Wysocki" <rjw@...ysocki.net>
To:     Mika Westerberg <mika.westerberg@...ux.intel.com>
Cc:     Lukas Wunner <lukas@...ner.de>, Yinghai Lu <yinghai@...nel.org>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        "linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: pciehp is broken from 4.10-rc1

On Monday, February 06, 2017 12:37:06 PM Mika Westerberg wrote:
> On Sun, Feb 05, 2017 at 08:34:54AM +0100, Lukas Wunner wrote:
> > > sca05-0a81fd8d:~ # echo 1 > /sys/bus/pci/slots/11/power
> > > [  375.376609] pci_hotplug: power_write_file: power = 1
> > > [  375.382175] pciehp 0000:b3:00.0:pcie004: pciehp_get_power_status: SLOTCTRL a8 value read 17f1
> > > [  375.392695] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> > > [  375.401370] pciehp 0000:b3:00.0:pcie004: pciehp_power_on_slot: SLOTCTRL a8 write cmd 0
> > > [  375.410231] pciehp 0000:b3:00.0:pcie004: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
> > > [  375.411071] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> > > [  375.445222] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> > > [  377.444400] pciehp 0000:b3:00.0:pcie004: Data Link Layer Link Active not set in 1000 msec
> > > [  378.960364] pci 0000:b4:00.0 id reading try 50 times with interval 20 ms to get ffffffff
> > > [  378.969406] pciehp 0000:b3:00.0:pcie004: pciehp_check_link_status: lnk_status = 5001
> > > [  378.978059] pciehp 0000:b3:00.0:pcie004: link training error: status 0x5001
> > > [  378.985834] pciehp 0000:b3:00.0:pcie004: Failed to check link status
> > > [  378.987185] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> > > [  378.987253] pciehp 0000:b3:00.0:pcie004: pciehp_power_off_slot: SLOTCTRL a8 write cmd 400
> > > [  380.000409] pciehp 0000:b3:00.0:pcie004: pciehp_green_led_off: SLOTCTRL a8 write cmd 300
> > > [  380.000674] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> > > [  380.018020] pciehp 0000:b3:00.0:pcie004: pciehp_set_attention_status: SLOTCTRL a8 write cmd 40
> > > [  380.019053] pciehp 0000:b3:00.0:pcie004: pending interrupts 0x0010 from Slot Status
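
For readers following the register values in the log above, here is a
minimal Python sketch that decodes them. The bit masks are the ones I
believe are defined in the kernel's include/uapi/linux/pci_regs.h; the
function names are made up for illustration. Note that the "write cmd"
values in the log carry only the field being changed, while "value read
17f1" is the full Slot Control register.

```python
# Slot Control / Slot Status bit masks (assumed to match pci_regs.h).
PCI_EXP_SLTCTL_AIC = 0x00C0  # Attention Indicator Control
PCI_EXP_SLTCTL_PIC = 0x0300  # Power Indicator Control
PCI_EXP_SLTCTL_PCC = 0x0400  # Power Controller Control (1 = power off)

# Two-bit indicator encoding shared by the attention and power indicators.
INDICATOR = {0: "reserved", 1: "on", 2: "blink", 3: "off"}

# Event bits in the Slot Status register, lowest bit first.
SLOT_STATUS_EVENTS = [
    (0x0001, "Attention Button Pressed"),
    (0x0002, "Power Fault Detected"),
    (0x0004, "MRL Sensor Changed"),
    (0x0008, "Presence Detect Changed"),
    (0x0010, "Command Completed"),
    (0x0100, "Data Link Layer State Changed"),
]


def decode_slot_control(value):
    """Decode the indicator and power fields of a full Slot Control value."""
    return {
        "attention_indicator": INDICATOR[(value & PCI_EXP_SLTCTL_AIC) >> 6],
        "power_indicator": INDICATOR[(value & PCI_EXP_SLTCTL_PIC) >> 8],
        "power_controller": "off" if value & PCI_EXP_SLTCTL_PCC else "on",
    }


def decode_slot_status(value):
    """Return the names of the event bits set in a Slot Status value."""
    return [name for mask, name in SLOT_STATUS_EVENTS if value & mask]


if __name__ == "__main__":
    print(decode_slot_control(0x17F1))  # value read before power-on
    print(decode_slot_status(0x0010))   # the repeated "pending interrupts"
```

Under these assumptions, 0x17f1 reads as a slot that is powered off with
both indicators off, and each "pending interrupts 0x0010" line is a
Command Completed event following a Slot Control write.
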
> 
> It would be good to see the output when 68db9bc is reverted. Yinghai,
> can you attach that to the bugzilla bug as well?
> 
> > So on this Skylake machine link training fails after resuming from D3hot
> > to D0.
> > 
> > One thing that's a bit fishy is that normally the Link Disable bit is
> > cleared when powering on the slot.  This results in a debug message
> > in dmesg containing the string "lnk_ctrl = ", and that line is missing
> > from the output you've pasted above, suggesting that the machine is
> > not running a stock v4.10 kernel after all but something else.  Could
> > you check why this message is not printed?  Could you check with lspci
> > if the Link Disable bit is set before you invoke "echo 1"?
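
One possible way to do the Link Disable check, as a sketch only: read the
Link Control register of the port as a raw hex word (for instance with
something like "setpci -s b3:00.0 CAP_EXP+10.w", assuming the port address
from the log and that Link Control sits at offset 0x10 of the PCI Express
capability) and test bit 4. The helper name below is hypothetical.

```python
# Link Disable is bit 4 of the Link Control register
# (PCI_EXP_LNKCTL_LD in the kernel headers, as far as I can tell).
PCI_EXP_LNKCTL_LD = 0x0010


def link_disabled(lnkctl_hex):
    """Return True if the Link Disable bit is set in a hex register dump."""
    return bool(int(lnkctl_hex, 16) & PCI_EXP_LNKCTL_LD)


if __name__ == "__main__":
    # Example values only, not taken from the affected machine.
    print(link_disabled("0050"))
    print(link_disabled("0040"))
```
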
> > 
> > This is the call stack:
> > pciehp_sysfs_enable_slot()
> >   pciehp_enable_slot()
> >     board_added()
> >       pciehp_power_on_slot()
> >         pciehp_link_enable()
> >           __pciehp_link_set()
> > 
> > Another theory is that the link is generally unreliable on this machine
> > since the Link Bandwidth Management Status bit is set in the Link Status
> > Register ("lnk_status = 5001"), which according to the spec means:
> > 
> > "Hardware has changed Link speed or width to attempt to correct unreliable
> > Link operation, either through an LTSSM timeout or a higher level process.
> > This bit must be set if the Physical Layer reports a speed or width change
> > was initiated by the Downstream component that was not indicated as an
> > autonomous change."
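
To make the reading of "lnk_status = 5001" concrete, here is a small
Python sketch that splits a Link Status value into its fields. The masks
are the ones I believe pci_regs.h uses; the function name and the speed
table are illustrative assumptions.

```python
# Link Status register fields (assumed to match pci_regs.h).
PCI_EXP_LNKSTA_CLS = 0x000F    # Current Link Speed
PCI_EXP_LNKSTA_NLW = 0x03F0    # Negotiated Link Width
PCI_EXP_LNKSTA_LT = 0x0800     # Link Training
PCI_EXP_LNKSTA_SLC = 0x1000    # Slot Clock Configuration
PCI_EXP_LNKSTA_DLLLA = 0x2000  # Data Link Layer Link Active
PCI_EXP_LNKSTA_LBMS = 0x4000   # Link Bandwidth Management Status
PCI_EXP_LNKSTA_LABS = 0x8000   # Link Autonomous Bandwidth Status

SPEEDS = {1: "2.5 GT/s", 2: "5 GT/s", 3: "8 GT/s"}


def decode_link_status(value):
    """Split a 16-bit Link Status value into named fields."""
    speed_code = value & PCI_EXP_LNKSTA_CLS
    return {
        "speed": SPEEDS.get(speed_code, "unknown (%d)" % speed_code),
        "width": (value & PCI_EXP_LNKSTA_NLW) >> 4,
        "link_training": bool(value & PCI_EXP_LNKSTA_LT),
        "slot_clock_config": bool(value & PCI_EXP_LNKSTA_SLC),
        "dll_link_active": bool(value & PCI_EXP_LNKSTA_DLLLA),
        "lbms": bool(value & PCI_EXP_LNKSTA_LBMS),
        "labs": bool(value & PCI_EXP_LNKSTA_LABS),
    }


if __name__ == "__main__":
    for name, field in decode_link_status(0x5001).items():
        print(name, "=", field)
```

With these masks, 0x5001 has LBMS and Slot Clock Configuration set, a
negotiated width of 0, and Data Link Layer Link Active clear, which is
consistent with the "Link Active not set" message earlier in the log.
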
> > 
> > In this case it would be good to know which hardware exactly we're dealing
> > with so that we might quirk it to not runtime suspend the port.  To that
> > end, could you attach a full dmesg log to the bugzilla entry I've created?
> > https://bugzilla.kernel.org/show_bug.cgi?id=193951
> > 
> > @Mika, Rafael: Are you aware of Skylake machines with unreliable link
> > training, or perhaps errata of Skylake chips related to link training
> > on hotplug ports?
> 
> According to the 100-series (the chipset used with Skylake) errata
> below, I don't see any mentions related to PCIe link training issues.
> 
> http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/100-series-chipset-spec-update.pdf

Still, it does look like errata to me.

At least I don't see what can be done on the software side to prevent this
from happening, except for leaving the port(s) in question in D0.

Thanks,
Rafael