Message-ID: <20250305222016.GA316198@bhelgaas>
Date: Wed, 5 Mar 2025 16:20:16 -0600
From: Bjorn Helgaas <helgaas@...nel.org>
To: hans.zhang@...tech.com
Cc: bhelgaas@...gle.com, cix-kernel-upstream@...tech.com,
linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
Peter Chen <peter.chen@...tech.com>, ChunHao Lin <hau@...ltek.com>,
Heiner Kallweit <hkallweit1@...il.com>, nic_swsd@...ltek.com,
netdev@...r.kernel.org
Subject: Re: [PATCH] PCI: Add PCI quirk to disable L0s ASPM state for RTL8125
2.5GbE Controller
[+cc r8169 maintainers, since upstream r8169 claims device 0x8125]

On Wed, Mar 05, 2025 at 02:30:35PM +0800, hans.zhang@...tech.com wrote:
> From: Hans Zhang <hans.zhang@...tech.com>
>
> Disable the L0s ASPM link state for the RTL8125 2.5GbE Controller
> because TX data can be corrupted when the link comes back out of L0s
> on some systems. This quirk uses the ASPM API to prevent the ASPM
> subsystem from re-enabling the L0s state.

Sounds like this should be a documented erratum.  Realtek folks?  Or
maybe an erratum on the other end of the link, which looks like a CIX
Root Port:

  https://admin.pci-ids.ucw.cz/read/PC/1f6c/0001

If it's a CIX Root Port defect, it could affect devices other than
RTL8125.

> And it causes the following AER errors:
> pcieport 0003:30:00.0: AER: Multiple Corrected error received: 0003:31:00.0
> pcieport 0003:30:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
> pcieport 0003:30:00.0: device [1f6c:0001] error status/mask=00001000/0000e000
> pcieport 0003:30:00.0: [12] Timeout
> r8125 0003:31:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
> r8125 0003:31:00.0: device [10ec:8125] error status/mask=00001000/0000e000
> r8125 0003:31:00.0: [12] Timeout
> r8125 0003:31:00.0: AER: Error of this Agent is reported first

Looks like a driver name of "r8125", but I don't see that upstream.
Is this an out-of-tree driver?
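
For reference, a minimal user-space sketch of how the status/mask pair
in the log above decodes.  The bit values follow the AER Correctable
Error Status register layout (same values as the PCI_ERR_COR_*
definitions in pci_regs.h); the helper names aer_cor_bit_name() and
aer_cor_unmasked() are made up for illustration and are not kernel
functions.

```c
#include <stdint.h>

/* AER Correctable Error Status/Mask bits (same layout for both registers). */
#define COR_RCVR       0x00000001u /* bit 0:  Receiver Error */
#define COR_BAD_TLP    0x00000040u /* bit 6:  Bad TLP */
#define COR_BAD_DLLP   0x00000080u /* bit 7:  Bad DLLP */
#define COR_REP_ROLL   0x00000100u /* bit 8:  REPLAY_NUM Rollover */
#define COR_REP_TIMER  0x00001000u /* bit 12: Replay Timer Timeout */
#define COR_ADV_NFAT   0x00002000u /* bit 13: Advisory Non-Fatal Error */
#define COR_INTERNAL   0x00004000u /* bit 14: Corrected Internal Error */
#define COR_LOG_OVER   0x00008000u /* bit 15: Header Log Overflow */

/* Hypothetical helper: map a bit number to a human-readable name. */
const char *aer_cor_bit_name(unsigned int bit)
{
	switch (bit) {
	case 0:  return "Receiver Error";
	case 6:  return "Bad TLP";
	case 7:  return "Bad DLLP";
	case 8:  return "REPLAY_NUM Rollover";
	case 12: return "Replay Timer Timeout";
	case 13: return "Advisory Non-Fatal Error";
	case 14: return "Corrected Internal Error";
	case 15: return "Header Log Overflow";
	default: return "Reserved";
	}
}

/* Hypothetical helper: errors that are set in status and not masked. */
uint32_t aer_cor_unmasked(uint32_t status, uint32_t mask)
{
	return status & ~mask;
}
```

With status=0x00001000 and mask=0x0000e000, the only bit set is 12
(Replay Timer Timeout) and it is not masked, which matches the
"[12] Timeout" lines from both the Root Port and the endpoint.
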
> And the RTL8125 website does not say that it supports L0s. It only supports
> L1 and L1ss.
>
> RTL8125 website: https://www.realtek.com/Product/Index?id=3962

I don't think it matters what the web site says.  Apparently the
device advertises L0s support via Link Capabilities.
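
To make that concrete, a small sketch of how the ASPM Support field
(bits 11:10 of Link Capabilities) decodes.  The mask values match
PCI_EXP_LNKCAP_ASPMS and friends in pci_regs.h; the function names
here are hypothetical, and any register value passed in is whatever
you read from the device, not something taken from this thread.

```c
#include <stdint.h>

/* Link Capabilities register, ASPM Support field (bits 11:10). */
#define LNKCAP_ASPMS     0x00000c00u /* whole ASPM Support field */
#define LNKCAP_ASPM_L0S  0x00000400u /* bit 10: L0s supported */
#define LNKCAP_ASPM_L1   0x00000800u /* bit 11: L1 supported */

/* Hypothetical helper: nonzero if the device advertises L0s. */
int lnkcap_advertises_l0s(uint32_t lnkcap)
{
	return (lnkcap & LNKCAP_ASPM_L0S) != 0;
}

/* Hypothetical helper: nonzero if the device advertises L1. */
int lnkcap_advertises_l1(uint32_t lnkcap)
{
	return (lnkcap & LNKCAP_ASPM_L1) != 0;
}

/* Hypothetical helper: short text for the ASPM Support field. */
const char *lnkcap_aspm_str(uint32_t lnkcap)
{
	switch (lnkcap & LNKCAP_ASPMS) {
	case LNKCAP_ASPM_L0S:                  return "L0s";
	case LNKCAP_ASPM_L1:                   return "L1";
	case LNKCAP_ASPM_L0S | LNKCAP_ASPM_L1: return "L0s L1";
	default:                               return "not supported";
	}
}
```

The same field is what "lspci -vv" shows after "ASPM" on the LnkCap
line, so that output from the affected system would show whether the
RTL8125 claims L0s regardless of what the product page lists.
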
> Signed-off-by: Hans Zhang <hans.zhang@...tech.com>
> Reviewed-by: Peter Chen <peter.chen@...tech.com>
> ---
> drivers/pci/quirks.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 82b21e34c545..5f69bb5ee3ff 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -2514,6 +2514,12 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10f1, quirk_disable_aspm_l0s);
> DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10f4, quirk_disable_aspm_l0s);
> DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1508, quirk_disable_aspm_l0s);
>
> +/*
> + * The RTL8125 may experience data corruption issues when transitioning out
> + * of L0S. To prevent this we need to disable L0S on the PCIe link.
> + */
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_REALTEK, 0x8125, quirk_disable_aspm_l0s);
> +
> static void quirk_disable_aspm_l0s_l1(struct pci_dev *dev)
> {
> 	pci_info(dev, "Disabling ASPM L0s/L1\n");
>
> base-commit: 99fa936e8e4f117d62f229003c9799686f74cebc
> --
> 2.47.1
>