Message-ID: <20260211182909.GA117627@bhelgaas>
Date: Wed, 11 Feb 2026 12:29:09 -0600
From: Bjorn Helgaas <helgaas@...nel.org>
To: Harshank Matkar <harshankmatkar1304@...look.com>
Cc: "intel-wired-lan@...ts.osuosl.org" <intel-wired-lan@...ts.osuosl.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"tony.nguyen@...el.com" <tony.nguyen@...el.com>,
"davem@...emloft.net" <davem@...emloft.net>,
"kuba@...nel.org" <kuba@...nel.org>,
"pabeni@...hat.com" <pabeni@...hat.com>,
"edumazet@...gle.com" <edumazet@...gle.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] igc: Add PCIe link recovery for I225/I226

On Tue, Feb 10, 2026 at 08:34:02PM +0000, Harshank Matkar wrote:
> From: Harshank Matkar <harshankmatkar1304@...look.com>
>
> When ASPM L0s transitions occur on Intel I225/I226 controllers,
> transient PCIe link instability can cause register read failures
> (0xFFFFFFFF responses).

At the PCIe level, the failure is some uncorrectable PCIe error like a
Completion Timeout or Unsupported Request.  The 0xFFFFFFFF response is
implementation-specific behavior determined by the Root Complex
design.

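
To illustrate why an all-ones read is only a hint and not proof of a
failed transaction, here is a minimal userspace sketch, similar in
spirit to the kernel's PCI_POSSIBLE_ERROR() check.  The helper names
and the idea of cross-checking a second register are assumptions for
illustration, not code from the patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* All-ones is what many Root Complexes fabricate for a failed read. */
#define ERROR_RESPONSE_32 0xFFFFFFFFu

/* True if val *might* be a fabricated error response. */
static bool possible_error(uint32_t val)
{
	return val == ERROR_RESPONSE_32;
}

/*
 * Hypothetical helper: treat a read as failed only if both the data
 * register and a second register that can never legitimately read as
 * all-ones (e.g. one with reserved-zero bits) come back as all-ones.
 */
static bool read_failed(uint32_t data, uint32_t never_all_ones_reg)
{
	return possible_error(data) && possible_error(never_all_ones_reg);
}
```

A register whose valid contents can be 0xFFFFFFFF cannot be judged on
its own, which is why the second read is there.
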
> Implement a multi-layer recovery strategy:
> 1. Immediate retries: 3 attempts with 100-200μs delays
> 2. Link retraining: Trigger PCIe link retraining via capabilities
> 3. Device detachment: Only as last resort after max attempts
>
> The recovery mechanism includes rate limiting, maximum attempt
> tracking, and device presence validation to prevent false detaches
> on transient ASPM glitches while maintaining safety through
> bounded retry limits.
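
For readers following along, the three layers described above can be
modeled roughly as below.  This is a userspace sketch under stated
assumptions, not the patch itself: the callback struct, the enum, and
the simulated device are all hypothetical, and rate limiting and
cross-call attempt tracking are left out:

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_IMMEDIATE_RETRIES 3	/* "3 attempts" from the description */

enum recovery_result {
	RECOVERED_BY_RETRY,
	RECOVERED_BY_RETRAIN,
	DEVICE_DETACHED,
};

/* Hypothetical callbacks standing in for the driver's real operations. */
struct recovery_ops {
	uint32_t (*read_reg)(void *ctx);	/* 0xFFFFFFFF on failure */
	void (*delay)(void *ctx);		/* 100-200 us in the description */
	bool (*retrain_link)(void *ctx);	/* true if retraining succeeded */
};

/* Intended to run after an initial read has already failed. */
static enum recovery_result recover_read(const struct recovery_ops *ops,
					 void *ctx, uint32_t *val)
{
	/* Layer 1: bounded immediate retries. */
	for (int i = 0; i < MAX_IMMEDIATE_RETRIES; i++) {
		*val = ops->read_reg(ctx);
		if (*val != 0xFFFFFFFFu)
			return RECOVERED_BY_RETRY;
		ops->delay(ctx);
	}

	/* Layer 2: retrain the link, then try one more read. */
	if (ops->retrain_link(ctx)) {
		*val = ops->read_reg(ctx);
		if (*val != 0xFFFFFFFFu)
			return RECOVERED_BY_RETRAIN;
	}

	/* Layer 3: last resort. */
	return DEVICE_DETACHED;
}

/* Simulated device: fails the first `failures_left` reads. */
struct fake_dev {
	int failures_left;
	bool retrain_ok;
};

static uint32_t fake_read(void *ctx)
{
	struct fake_dev *d = ctx;

	if (d->failures_left > 0) {
		d->failures_left--;
		return 0xFFFFFFFFu;
	}
	return 0x12345678u;
}

static void fake_delay(void *ctx)
{
	(void)ctx;	/* no real delay in the simulation */
}

static bool fake_retrain(void *ctx)
{
	struct fake_dev *d = ctx;

	return d->retrain_ok;
}

static const struct recovery_ops fake_ops = {
	.read_reg = fake_read,
	.delay = fake_delay,
	.retrain_link = fake_retrain,
};
```

With the simulated device, two failing reads recover in layer 1, three
failing reads need the retrain in layer 2, and a device that keeps
returning all-ones ends up detached.
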

I assume the glitch is a hardware erratum and should be documented as
such by Intel, although it's possible ASPM L0s isn't configured
correctly.

If it's a hardware erratum, I think you should use a quirk to disable
L0s on these devices, e.g., pci_disable_link_state(pdev,
PCIE_LINK_STATE_L0S).  Even if this patch allows recovery, the PCIe
errors will be logged and reported via AER, which will be confusing to
users.

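
A rough sketch of what such a quirk could look like follows.  The
function name is hypothetical, the device ID 0x15f2 is an assumed
example (the full list of affected I225/I226 IDs would have to be
filled in), and the #else branch contains userspace stand-ins, not the
kernel definitions, purely so the sketch can be built and exercised
outside the kernel:

```c
#ifdef __KERNEL__
#include <linux/pci.h>
#else
/* Userspace stand-ins (NOT the kernel definitions) so the sketch builds. */
struct pci_dev {
	int disabled_link_states;
};
#define PCIE_LINK_STATE_L0S 0x1	/* placeholder value for the stub only */

static int pci_disable_link_state(struct pci_dev *pdev, int state)
{
	pdev->disabled_link_states |= state;
	return 0;
}
#endif

/* Hypothetical quirk: keep ASPM L0s off for an affected device. */
static void quirk_igc_disable_l0s(struct pci_dev *pdev)
{
	pci_disable_link_state(pdev, PCIE_LINK_STATE_L0S);
}

#ifdef __KERNEL__
/* 0x15f2 is an assumed example ID; one entry per affected device. */
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x15f2, quirk_igc_disable_l0s);
#endif
```

With L0s never entered, the failing reads should not happen in the
first place, so there is nothing for AER to log.
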
Bjorn