linux-kernel - Re: [PATCH v5 5/5] PCI: Work around PCIe link training failures

Message-ID: <alpine.DEB.2.21.2211290004150.58543@angie.orcam.me.uk>
Date:   Tue, 29 Nov 2022 09:57:51 +0000 (GMT)
From:   "Maciej W. Rozycki" <macro@...am.me.uk>
To:     Alex Williamson <alex.williamson@...hat.com>
cc:     Bjorn Helgaas <helgaas@...nel.org>,
        Pali Rohár <pali@...nel.org>,
        Bjorn Helgaas <bhelgaas@...gle.com>, Stefan Roese <sr@...x.de>,
        Jim Wilson <wilson@...iptree.org>,
        David Abdurachmanov <david.abdurachmanov@...il.com>,
        linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5 5/5] PCI: Work around PCIe link training failures

On Wed, 9 Nov 2022, Alex Williamson wrote:

> > 05:00.0 supports the "bus" method, i.e., pci_reset_bus_function(),
> > which tries pci_dev_reset_slot_function() followed by
> > pci_parent_bus_reset().  Both of them return -ENOTTY if the device
> > (05:00.0) has a secondary bus ("dev->subordinate"), so I think nothing
> > happens here.
> 
> Right, the pci-sysfs reset attribute is only meant for a reset scope
> limited to the device, we'd need something to call pci_reset_bus() to
> commit to the whole hierarchy, which is not something we typically do.
> vfio-pci will only bind to endpoint devices, so it shouldn't provide an
> interface to inject a bus reset here either.
> 
> Based on the fact that there's a pericom switch in play here, I'll just
> note that I think this is the same device with other link speed issues
> as well:
> 
> https://lore.kernel.org/all/20161026180140.23495.27388.stgit@gimli.home/

 Thanks for the pointer.  This has been superseded by commit acd61ffb2f16 
("PCI: Add ACS quirk for Pericom PI7C9X2G switches"), right?  In which 
case it is a match ([12d8:2304]), though the quirk does not trigger here, 
i.e. no message is printed about store-forward mode activation:

pcieport 0000:05:00.0: calling  pci_fixup_pericom_acs_store_forward+0x0/0xba @ 1
pcieport 0000:05:00.0: pci_fixup_pericom_acs_store_forward+0x0/0xba took 0 usecs
[...]
pci 0000:05:00.0: calling  pci_fixup_pericom_acs_store_forward+0x0/0xba @ 1
pci 0000:05:00.0: pci_fixup_pericom_acs_store_forward+0x0/0xba took 0 usecs
[...]
pcieport 0000:06:01.0: calling  pci_fixup_pericom_acs_store_forward+0x0/0xba @ 1
pcieport 0000:06:01.0: pci_fixup_pericom_acs_store_forward+0x0/0xba took 3 usecs
[...]
pcieport 0000:06:02.0: calling  pci_fixup_pericom_acs_store_forward+0x0/0xba @ 1
pcieport 0000:06:02.0: pci_fixup_pericom_acs_store_forward+0x0/0xba took 2 usecs

NB I don't know why the quirk for the upstream port (05:00.0) is called 
twice, both via pcieport and via pci.
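
 FWIW a minimal Python sketch for tallying such fixup calls per device 
from a saved log, which makes the double invocation easy to spot.  It 
assumes the line format shown above with any timestamp prefix already 
stripped; `parse_fixup_calls' is a hypothetical helper name, not an 
existing tool:

```python
import re
from collections import defaultdict

# Matches only the completion lines of the fixup trace, e.g.
# "pcieport 0000:05:00.0: pci_fixup_...+0x0/0xba took 0 usecs".
# The "calling" lines are skipped on purpose to avoid double counting.
FIXUP_DONE = re.compile(
    r"^(?P<driver>\S+) "
    r"(?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"(?P<func>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+ took (?P<usecs>\d+) usecs$"
)


def parse_fixup_calls(lines):
    """Group completed fixup calls by device address (hypothetical helper)."""
    calls = defaultdict(list)
    for line in lines:
        m = FIXUP_DONE.match(line.strip())
        if m:
            calls[m["bdf"]].append((m["driver"], m["func"], int(m["usecs"])))
    return dict(calls)


# The log excerpt quoted in this message, with the "[...]" gaps dropped.
LOG = """\
pcieport 0000:05:00.0: calling  pci_fixup_pericom_acs_store_forward+0x0/0xba @ 1
pcieport 0000:05:00.0: pci_fixup_pericom_acs_store_forward+0x0/0xba took 0 usecs
pci 0000:05:00.0: calling  pci_fixup_pericom_acs_store_forward+0x0/0xba @ 1
pci 0000:05:00.0: pci_fixup_pericom_acs_store_forward+0x0/0xba took 0 usecs
pcieport 0000:06:01.0: calling  pci_fixup_pericom_acs_store_forward+0x0/0xba @ 1
pcieport 0000:06:01.0: pci_fixup_pericom_acs_store_forward+0x0/0xba took 3 usecs
pcieport 0000:06:02.0: calling  pci_fixup_pericom_acs_store_forward+0x0/0xba @ 1
pcieport 0000:06:02.0: pci_fixup_pericom_acs_store_forward+0x0/0xba took 2 usecs
"""

if __name__ == "__main__":
    for bdf, entries in parse_fixup_calls(LOG.splitlines()).items():
        # Print which prefix (pcieport vs pci) each call was logged under.
        print(bdf, [driver for driver, _, _ in entries])
```

This only counts what was logged; it says nothing about why the upstream 
port gets the fixup run under both prefixes.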

> This fell off my plate some time ago, but as noted there, enabling ACS
> when the upstream and downstream ports run at different link rates
> exposes errata where packets are queued and not delivered within the
> switch.
> 
> Could enabling ACS on this device be contributing to the issue here,
> for example triggering the Asmedia downstream port to get into this
> link resetting issue?  A test with
> pci=disable_acs_redir=0000:06:01.0;0000:06:02.0 could be interesting
> assuming this occurs on a platform that has an IOMMU, i.e. calls
> pci_request_acs().  Thanks,

 We have no IOMMU support for any RISC-V machine at the moment:

config ARCH_RV64I
	[...]
	select SWIOTLB if MMU

and:

software IO TLB: area num 4.
software IO TLB: mapped [mem 0x00000000fb732000-0x00000000ff732000] (64MB)

so IIUC this issue does not apply.  Thank you for your input.
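
 For anyone wanting to double-check this on their own system, here is a 
rough Python sketch.  It assumes the usual sysfs layout where IOMMU 
instances registered with the kernel appear under /sys/class/iommu; an 
empty result is only a hint, as not every configuration need populate that 
directory, and `iommu_units' is a name made up for illustration:

```python
import os


def iommu_units(sysfs_class="/sys/class/iommu"):
    """Return the names of IOMMU instances listed in sysfs, if any.

    Assumption: registered IOMMUs show up as entries in this directory.
    A missing directory is treated the same as an empty one.
    """
    try:
        return sorted(os.listdir(sysfs_class))
    except FileNotFoundError:
        return []


if __name__ == "__main__":
    units = iommu_units()
    if units:
        print("IOMMU instances:", ", ".join(units))
    else:
        print("no IOMMU instances listed in sysfs")
```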

  Maciej