Date:	Wed, 27 Jul 2016 16:46:16 -0500
From:	Bjorn Helgaas <helgaas@...nel.org>
To:	Joseph Salisbury <joseph.salisbury@...onical.com>
Cc:	Bjorn Helgaas <bhelgaas@...gle.com>,
	Yinghai Lu <yinghai@...nel.or>,
	"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>, john_mazzie@...l.com,
	Mark Wenning <mark.wenning@...onical.com>
Subject: Re: [Regression][3.18-rc1 -> mainline] PCI: Configure *all* devices,
 not just hot-added ones

On Wed, Jul 27, 2016 at 02:23:24PM -0400, Joseph Salisbury wrote:
> A kernel bug report was opened against Ubuntu [0].  After a kernel
> bisect, it was found that reverting the following commit resolved this bug:
> 
> commit 1302fcf0d03e6ea74846c7fee14736306ab2ce4b
> Author: Bjorn Helgaas <bhelgaas@...gle.com>
> Date: Sat Aug 30 07:23:01 2014 -0600
> 
>     PCI: Configure *all* devices, not just hot-added ones
> 
> The regression was introduced as of v3.18-rc1 and the bug still exists
> in current mainline.
> 
> [0] http://pad.lv/1571798

I added the following response to the Launchpad bug report; pasting it
here for better visibility:

The register in question is the Advanced Error Capabilities and
Control register, at offset 0x18 in the Advanced Error Reporting
capability, which starts at 0x148 in the config space of device
80:02.0.
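
For anyone who wants to look at the register directly, here is a minimal
sketch of the address arithmetic. PCI_ERR_CAP (0x18) is the offset Linux
uses in pci_regs.h; ext_cap_reg_addr() is a hypothetical helper named
only for this illustration, not a kernel function.

```c
#include <stdint.h>

/* Offset of the Advanced Error Capabilities and Control register
 * within the AER capability (PCI_ERR_CAP in Linux's pci_regs.h). */
#define PCI_ERR_CAP 0x18

/* Hypothetical helper: absolute config-space address of a register
 * located reg_off bytes into a capability that starts at cap_base. */
static inline uint16_t ext_cap_reg_addr(uint16_t cap_base, uint16_t reg_off)
{
	return (uint16_t)(cap_base + reg_off);
}
```

With the AER capability at 0x148, ext_cap_reg_addr(0x148, PCI_ERR_CAP)
gives 0x160 as the config-space address to inspect on 80:02.0.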

In the pre-boot value of 0x00a0, the following bits are set (per PCIe
spec r3.0, sec 7.10.7, these bits are read-only):

  PCI_ERR_CAP_ECRC_GENC 0x00000020 /* ECRC Generation Capable */
  PCI_ERR_CAP_ECRC_CHKC 0x00000080 /* ECRC Check Capable */

In the value of 0x01e0 after Linux boots, the following additional
bits are set:

  PCI_ERR_CAP_ECRC_GENE 0x00000040 /* ECRC Generation Enable */
  PCI_ERR_CAP_ECRC_CHKE 0x00000100 /* ECRC Check Enable */
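
As a quick cross-check of the two values, a small sketch that computes
which bits changed. The bit definitions are the ones listed above;
newly_set_bits() and print_ecrc_bits() are hypothetical helpers for
illustration only.

```c
#include <stdint.h>
#include <stdio.h>

#define PCI_ERR_CAP_ECRC_GENC 0x00000020 /* ECRC Generation Capable */
#define PCI_ERR_CAP_ECRC_GENE 0x00000040 /* ECRC Generation Enable */
#define PCI_ERR_CAP_ECRC_CHKC 0x00000080 /* ECRC Check Capable */
#define PCI_ERR_CAP_ECRC_CHKE 0x00000100 /* ECRC Check Enable */

/* Bits that are set in 'after' but were clear in 'before'. */
static uint32_t newly_set_bits(uint32_t before, uint32_t after)
{
	return after & ~before;
}

/* Print the state of the four ECRC bits in a register value. */
static void print_ecrc_bits(uint32_t val)
{
	printf("%#06x: GENC=%d GENE=%d CHKC=%d CHKE=%d\n", (unsigned)val,
	       !!(val & PCI_ERR_CAP_ECRC_GENC),
	       !!(val & PCI_ERR_CAP_ECRC_GENE),
	       !!(val & PCI_ERR_CAP_ECRC_CHKC),
	       !!(val & PCI_ERR_CAP_ECRC_CHKE));
}
```

newly_set_bits(0x00a0, 0x01e0) evaluates to 0x0140, i.e.
PCI_ERR_CAP_ECRC_GENE | PCI_ERR_CAP_ECRC_CHKE, which matches the two
bits listed above.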

Linux is setting these bits in program_hpp_type2() because there is
apparently an ACPI _HPX method that applies to this device, and it
returns a PCI Express setting record (ACPI spec 5.0, sec 6.2.8.3) with
an "Advanced Error Capabilities and Control Register OR Mask" that has
PCI_ERR_CAP_ECRC_GENE and PCI_ERR_CAP_ECRC_CHKE set.
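
To make the mechanism concrete, a rough sketch of how an AND/OR mask
pair from a PCI Express setting record would be applied to the register.
This is a simplified model, not the actual program_hpp_type2() code, and
apply_hpx_masks() is a hypothetical name. The mask values used in the
note below are assumptions until an ACPI dump confirms what _HPX really
returns.

```c
#include <stdint.h>

#define PCI_ERR_CAP_ECRC_GENE 0x00000040 /* ECRC Generation Enable */
#define PCI_ERR_CAP_ECRC_CHKE 0x00000100 /* ECRC Check Enable */

/* Hypothetical model of applying a setting record's mask pair:
 * keep only the bits allowed by and_mask, then force on the bits
 * in or_mask. */
static uint32_t apply_hpx_masks(uint32_t reg, uint32_t and_mask,
				uint32_t or_mask)
{
	return (reg & and_mask) | or_mask;
}
```

Assuming an AND mask of 0xffffffff and an OR mask of
PCI_ERR_CAP_ECRC_GENE | PCI_ERR_CAP_ECRC_CHKE, applying this to the
pre-boot value 0x00a0 yields 0x01e0, the value seen after Linux boots.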

Can you collect an ACPI dump to confirm that this is the case?

As I mentioned in the 1302fcf0d03e changelog, it's not completely
clear from the spec (ACPI 5.0, sec 6.2.8) when to apply these _HPX
settings. It says OSPM should use them to "configure devices not
configured by the platform firmware during initial system boot." The
question is how OSPM can tell whether a device has been configured by
platform firmware.

Since I don't know how to tell if a device has been configured by
platform firmware, I chose to apply the _HPX settings to *all*
devices.

Any BIOS folks want to suggest a way to tell whether firmware has
configured a device?
