lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240828205357.GA36177@bhelgaas>
Date: Wed, 28 Aug 2024 15:53:57 -0500
From: Bjorn Helgaas <helgaas@...nel.org>
To: Lukas Wunner <lukas@...ner.de>
Cc: linux-pci@...r.kernel.org,
	Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>,
	"Maciej W . Rozycki" <macro@...am.me.uk>,
	Mika Westerberg <mika.westerberg@...ux.intel.com>,
	"Rafael J . Wysocki" <rafael.j.wysocki@...el.com>,
	Mario Limonciello <mario.limonciello@....com>,
	Duc Dang <ducdang@...gle.com>,
	Alex Williamson <alex.williamson@...hat.com>,
	linux-kernel@...r.kernel.org, Bjorn Helgaas <bhelgaas@...gle.com>,
	Stanislav Spassov <stanspas@...zon.de>,
	Rajat Jain <rajatja@...gle.com>
Subject: Re: [RFC PATCH 1/3] PCI: Wait for device readiness with
 Configuration RRS

[+cc Stanislav, Rajat]

On Wed, Aug 28, 2024 at 06:17:04AM +0200, Lukas Wunner wrote:
> On Tue, Aug 27, 2024 at 06:48:46PM -0500, Bjorn Helgaas wrote:
> > @@ -1311,9 +1320,15 @@ static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout)
> >  			return -ENOTTY;
> >  		}
> >  
> > -		pci_read_config_dword(dev, PCI_COMMAND, &id);
> > -		if (!PCI_POSSIBLE_ERROR(id))
> > -			break;
> > +		if (root && root->config_crs_sv) {
> > +			pci_read_config_dword(dev, PCI_VENDOR_ID, &id);
> > +			if (!pci_bus_crs_vendor_id(id))
> > +				break;
> 
> There was an effort from Amazon back in 2020/2021 to improve CRS support:
> 
> https://lore.kernel.org/linux-pci/20200307172044.29645-1-stanspas@amazon.com/

Thanks for reminding me of that, and I'm sorry that series didn't get
applied back then because it's very similar to this one.

> One suggestion you raised in the subsequent discussion was to use a
> 16-bit (word) instead of a 32-bit (dword) read of the Vendor ID
> register to avoid issues with devices that don't implement CRS SV
> correctly:
> 
> https://lore.kernel.org/linux-pci/20210913160745.GA1329939@bjorn-Precision-5520/

Thanks.  Reading that response, I don't understand my point about using
a 16-bit read.  I mentioned 89665a6a7140 ("PCI: Check only the Vendor
ID to identify Configuration Request Retry"), the commit log of which
points to http://lkml.kernel.org/r/4729FC36.3040000@gmail.com, which
documents several defective devices that have a Vendor ID of 0x0001.

E.g., the Realtek rtl8169 controller mentioned there has Vendor/Device
ID of [0001:8168].  Doing either a 16-bit read or a 32-bit read and
checking the low 16 bits would incorrectly treat that as a Config RRS
completion.

So it *looks* to me like we will time out after 60 seconds and
conclude the device never became ready:

  pci_scan_device(..., timeout=60*1000)
    pci_bus_read_dev_vendor_id
      pci_bus_generic_read_dev_vendor_id
        pci_bus_read_config_dword(PCI_VENDOR_ID, l)  # <--
        # *l == 0x81680001
        if (pci_bus_crs_vendor_id(*l))        # 0x81680001 & 0xffff = 0x0001
          pci_bus_wait_crs(..., timeout=60*1000)
            while (pci_bus_crs_vendor_id(*l)) {
              if (delay > timeout)
                return false;                 # device not ready
              pci_bus_read_config_dword(PCI_VENDOR_ID, l)
            }

That *might* be an argument for doing a 32-bit read and checking for
0xffff0001, since the spec requires all 1's in the extra bytes.  But
I'm not aware of any complaints about these broken devices with a
0x0001 Vendor ID, and the 89665a6a7140 commit log says there are also
defective devices that don't fill the extra bytes with all 1's.

So my inclination is to keep the current code that does a 32-bit read
and checks only the low 16 bits.

> I realize that pci_bus_crs_vendor_id() masks the Device ID bits,
> so probably no biggie.  Just want to make sure all lessons learned
> during previous discussions on this topic are considered. :)


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ