linux-kernel - Re: [PATCH 2/2] PCI: Fix the PCIe bridge decreasing to Gen 1 during hotplug testing

Message-ID: <alpine.DEB.2.21.2501111543050.18889@angie.orcam.me.uk>
Date: Sat, 11 Jan 2025 16:00:40 +0000 (GMT)
From: "Maciej W. Rozycki" <macro@...am.me.uk>
To: Jiwei Sun <jiwei.sun.bj@...com>
cc: ilpo.jarvinen@...ux.intel.com, Bjorn Helgaas <bhelgaas@...gle.com>, 
    linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org, 
    guojinhui.liam@...edance.com, helgaas@...nel.org, lukas@...ner.de, 
    ahuang12@...ovo.com, sunjw10@...ovo.com
Subject: Re: [PATCH 2/2] PCI: Fix the PCIe bridge decreasing to Gen 1 during
 hotplug testing

On Fri, 10 Jan 2025, Jiwei Sun wrote:

> In order to fix the issue, don't do the retraining work except ASMedia
> ASM2824.

 I have yet to go through your whole submission in detail, but this 
assumption defeats the purpose of the workaround: the current 
understanding is that the origin of the training failure, and hence the 
reason to retrain by hand with the speed limited to 2.5GT/s, is the 
*downstream* device rather than the ASMedia ASM2824 switch.

 That is also why the quirk has been wired to run everywhere rather than
being keyed by VID:DID.  The VID:DID of the switch is only listed, 
conservatively, because it seems safe with the switch to lift the speed 
restriction once the link has successfully completed training.
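
 To illustrate the structure described above, here is a minimal user-space 
sketch.  It is not the kernel code: the types, the helper names and the 
simplified training model (a link that is down only comes up at 2.5GT/s) 
are all hypothetical and chosen for illustration only.  The one detail 
taken from real hardware is the 1b21:2824 ID used for the ASM2824 entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical link speeds, ordered by PCIe generation. */
enum link_speed { SPEED_2_5GT = 1, SPEED_5GT, SPEED_8GT, SPEED_16GT };

/* Hypothetical model of an upstream port and the link below it. */
struct port {
	uint16_t vendor, device;	/* VID:DID of the upstream port */
	enum link_speed max_speed;	/* highest speed the port supports */
	enum link_speed target;		/* current target link speed */
	bool link_up;			/* has the link completed training? */
};

struct port_id { uint16_t vendor, device; };

/* Ports where lifting the restriction afterwards is believed safe. */
static const struct port_id lift_ok[] = {
	{ 0x1b21, 0x2824 },		/* ASMedia ASM2824 */
};

static bool may_lift_restriction(const struct port *p)
{
	for (size_t i = 0; i < sizeof(lift_ok) / sizeof(lift_ok[0]); i++)
		if (lift_ok[i].vendor == p->vendor &&
		    lift_ok[i].device == p->device)
			return true;
	return false;
}

/*
 * Simplified training model (an assumption of this sketch): a link that
 * is down only comes up at 2.5GT/s; a link that is already up accepts
 * any new target speed.
 */
static bool train(struct port *p, enum link_speed target)
{
	p->target = target;
	if (!p->link_up)
		p->link_up = (target == SPEED_2_5GT);
	return p->link_up;
}

/* Returns true if the link is up once the workaround has run. */
static bool failed_link_retrain(struct port *p)
{
	assert(p != NULL);

	if (p->link_up)
		return true;		/* link is fine, leave it alone */

	/* Step 1: applies to every port, regardless of VID:DID. */
	if (!train(p, SPEED_2_5GT))
		return false;

	/* Step 2: only listed ports get the restriction lifted. */
	if (may_lift_restriction(p))
		train(p, p->max_speed);

	return true;
}
```

 In this sketch the ID table gates only the second step, so an unlisted 
port still gets retrained at 2.5GT/s and simply stays at that speed.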

 Overall I think we need to get your problem sorted differently, because I 
suppose in principle your hot-plug scenario could also happen with the 
ASMedia ASM2824 switch as the upstream device and your NVMe storage 
element as the downstream device.  Perhaps the speed restriction could 
always be lifted, with the bandwidth controller infrastructure used to do 
that, so that it doesn't have to happen within `pcie_failed_link_retrain'?

  Maciej