[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <tencent_D1EA17EA4011F145E9B70F3C08E57C37890A@qq.com>
Date: Mon, 13 Jan 2025 20:44:37 +0800
From: Jiwei <jiwei.sun.bj@...com>
To: "Maciej W. Rozycki" <macro@...am.me.uk>
Cc: ilpo.jarvinen@...ux.intel.com, Bjorn Helgaas <bhelgaas@...gle.com>,
linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
guojinhui.liam@...edance.com, helgaas@...nel.org, lukas@...ner.de,
ahuang12@...ovo.com, sunjw10@...ovo.com
Subject: Re: [PATCH 2/2] PCI: Fix the PCIe bridge decreasing to Gen 1 during
hotplug testing
On 1/12/25 00:00, Maciej W. Rozycki wrote:
> On Fri, 10 Jan 2025, Jiwei Sun wrote:
>
>> In order to fix the issue, don't do the retraining work except ASMedia
>> ASM2824.
>
> I yet need to go through all of your submission in detail, but this
> assumption defeats the purpose of the workaround, as the current
> understanding of the origin of the training failure and the reason to
> retrain by hand with the speed limited to 2.5GT/s is the *downstream*
> device rather than the ASMedia ASM2824 switch.
>
> It is also why the quirk has been wired to run everywhere rather than
> having been keyed by VID:DID, and the VID:DID of the switch is only
> listed, conservatively, because it seems safe with the switch to lift the
> speed restriction once the link has successfully completed training.
>
> Overall I think we need to get your problem sorted differently, because I
> suppose in principle your hot-plug scenario could also happen with the
> ASMedia ASM2824 switch as the upstream device and your NVMe storage
> element as the downstream device. Perhaps the speed restriction could be
> always lifted, and then the bandwidth controller infrastructure used for
> that, so that it doesn't have to happen within `pcie_failed_link_retrain'?
According to our test, the following modification can fix the issue in our
test machine.
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 02d2e16672a8..9ca051b86878 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -97,10 +97,6 @@ static bool pcie_lbms_seen(struct pci_dev *dev, u16 lnksta)
*/
int pcie_failed_link_retrain(struct pci_dev *dev) {
- static const struct pci_device_id ids[] = {
- { PCI_VDEVICE(ASMEDIA, 0x2824) }, /* ASMedia ASM2824 */
- {}
- };
u16 lnksta, lnkctl2;
int ret = -ENOTTY;
@@ -128,8 +124,7 @@ int pcie_failed_link_retrain(struct pci_dev *dev)
}
if ((lnksta & PCI_EXP_LNKSTA_DLLLA) &&
- (lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
- pci_match_id(ids, dev)) {
+ (lnkctl2 & PCI_EXP_LNKCTL2_TLS) ==
+ PCI_EXP_LNKCTL2_TLS_2_5GT) {
u32 lnkcap;
pci_info(dev, "removing 2.5GT/s downstream link speed restriction\n");
But I don't know if the above modification will have any other negative effects
on other devices. Could you please share your thoughts?
Thanks,
Regards,
Jiwei
>
> Maciej
Powered by blists - more mailing lists