Message-ID: <87y0w74m58.fsf@intel.com>
Date: Thu, 10 Apr 2025 16:44:35 -0700
From: Vinicius Costa Gomes <vinicius.gomes@...el.com>
To: Jacob Keller <jacob.e.keller@...el.com>, Anthony Nguyen
<anthony.l.nguyen@...el.com>
Cc: david.zage@...el.com, rodrigo.cadore@...coustics.com,
intel-wired-lan@...ts.osuosl.org, netdev@...r.kernel.org, Jacob Keller
<jacob.e.keller@...el.com>, Christopher S M Hall
<christopher.s.hall@...el.com>, Michal Swiatkowski
<michal.swiatkowski@...ux.intel.com>, Mor Bar-Gabay
<morx.bar.gabay@...el.com>, Avigail Dahan <avigailx.dahan@...el.com>,
Corinna Vinschen <vinschen@...hat.com>
Subject: Re: [PATCH iwl-net v4 0/6] igc: Fix PTM timeout
Hi,
Jacob Keller <jacob.e.keller@...el.com> writes:
> There have been sporadic reports of PTM timeouts using i225/i226 devices.
>
> These timeouts have been root caused to:
>
> 1) Manipulating the PTM status register while PTM is enabled and triggered
> 2) The hardware retrying too quickly when an inappropriate response is
> received from the upstream device
>
> The issue can be reproduced with the following:
>
> $ sudo phc2sys -R 1000 -O 0 -i tsn0 -m
>
> Note: 1000 Hz (-R 1000) is unrealistically large, but provides a way to
> quickly reproduce the issue.
>
> PHC2SYS exits with:
>
> "ioctl PTP_OFFSET_PRECISE: Connection timed out" when the PTM transaction
> fails
>
> The first patch in this series also resolves an issue reported by Corinna
> Vinschen relating to kdump:
>
> This patch also fixes a hang in igc_probe() when loading the igc
> driver in the kdump kernel on systems supporting PTM.
>
> The igc driver running in the base kernel enables PTM trigger in
> igc_probe(). Therefore the driver is always in PTM trigger mode,
> except in brief periods when manually triggering a PTM cycle.
>
> When a crash occurs, the NIC is reset while PTM trigger is enabled.
> Due to a hardware problem, the NIC is subsequently in a bad busmaster
> state and doesn't handle register reads/writes. When running
> igc_probe() in the kdump kernel, the first register access to a NIC
> register hangs driver probing and ultimately breaks kdump.
>
> With this patch, igc has PTM trigger disabled most of the time,
> and the trigger is only enabled for very brief (10 - 100 us) periods
> when manually triggering a PTM cycle. Chances that a crash occurs
> during a PTM trigger are not zero, but extremely reduced.
>
> Signed-off-by: Jacob Keller <jacob.e.keller@...el.com>
> ---
> Changes in v4:
> - Jacob taking over sending v4 due to lack of time on Chris's part.
> - Updated commit messages based on review feedback from v3
> - Updated commit titles to slightly more imperative wording
> - Link to v3: https://lore.kernel.org/r/20241106184722.17230-1-christopher.s.hall@intel.com
> Changes in v3:
> - Added mutex_destroy() to clean up PTM lock.
> - Added missing checks for PTP enabled flag called from igc_main.c.
> - Clean up PTP module if probe fails.
> - Wrap all access to PTM registers with PTM lock/unlock.
> - Link to v2: https://lore.kernel.org/netdev/20241023023040.111429-1-christopher.s.hall@intel.com/
> Changes in v2:
> - Removed patch modifying PTM retry loop count.
> - Moved PTM mutex initialization from igc_reset() to igc_ptp_init(), called
> once during igc_probe().
> - Link to v1: https://lore.kernel.org/netdev/20240807003032.10300-1-christopher.s.hall@intel.com/
>
> ---
> Christopher S M Hall (6):
> igc: fix PTM cycle trigger logic
> igc: increase wait time before retrying PTM
> igc: move ktime snapshot into PTM retry loop
> igc: handle the IGC_PTP_ENABLED flag correctly
> igc: cleanup PTP module if probe fails
> igc: add lock preventing multiple simultaneous PTM transactions
>
For the series:
Acked-by: Vinicius Costa Gomes <vinicius.gomes@...el.com>
Cheers,
--
Vinicius