[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <bb42f26c-ee15-4ff9-a8fb-09669b727ced@intel.com>
Date: Wed, 19 Feb 2025 14:30:25 -0800
From: Jacob Keller <jacob.e.keller@...el.com>
To: Simon Horman <horms@...nel.org>
CC: netdev <netdev@...r.kernel.org>, Anthony Nguyen
<anthony.l.nguyen@...el.com>, Karol Kolacinski <karol.kolacinski@...el.com>
Subject: Re: [PATCH iwl-net] ice: ensure periodic output start time is in the
future
On 2/15/2025 6:59 AM, Simon Horman wrote:
> On Wed, Feb 12, 2025 at 03:54:39PM -0800, Jacob Keller wrote:
>> From: Karol Kolacinski <karol.kolacinski@...el.com>
>>
>> On E800 series hardware, if the start time for a periodic output signal is
>> programmed into GLTSYN_TGT_H and GLTSYN_TGT_L registers, the hardware logic
>> locks up and the periodic output signal never starts. Any future attempt to
>> reprogram the clock function is futile as the hardware will not reset until
>> a power on.
>>
>> The ice_ptp_cfg_perout function has logic to prevent this, as it checks if
>> the requested start time is in the past. If so, a new start time is
>> calculated by rounding up.
>>
>> Since commit d755a7e129a5 ("ice: Cache perout/extts requests and check
>> flags"), the rounding is done to the nearest multiple of the clock period,
>> rather than to a full second. This is more accurate, since it ensures the
>> signal matches the user request precisely.
>>
>> Unfortunately, there is a race condition with this rounding logic. If the
>> current time is close to the multiple of the period, we could calculate a
>> target time that is extremely soon. It takes time for the software to
>> program the registers, during which time this requested start time could
>> become a start time in the past. If that happens, the periodic output
>> signal will lock up.
>>
>> For large enough periods, or for the logic prior to the mentioned commit,
>> this is unlikely. However, with the new logic rounding to the period and
>> with a small enough period, this becomes inevitable.
>>
>> For example, attempting to enable a 10MHz signal requires a period of 100
>> nanoseconds. This means in the *best* case, we have 99 nanoseconds to
>> program the clock output. This is essentially impossible, and thus such a
>> small period practically guarantees that the clock output function will
>> lock up.
>>
>> To fix this, add some slop to the clock time used to check if the start
>> time is in the past. Because it is not critical that output signals start
>> immediately, but it *is* critical that we do not brick the function, 0.5
>> seconds is selected. This does mean that any requested output will be
>> delayed by at least 0.5 seconds.
>>
>> This slop is applied before rounding, so that we always round up to the
>> nearest multiple of the period that is at least 0.5 seconds in the future,
>> ensuring a minimum of 0.5 seconds to program the clock output registers.
>>
>> Finally, to ensure that the hardware registers programming the clock output
>> complete in a timely manner, add a write flush to the end of
>> ice_ptp_write_perout. This ensures we don't risk any issue with PCIe
>> transaction batching.
>>
>> Strictly speaking, this fixes a race condition all the way back at the
>> initial implementation of periodic output programming, as it is
>> theoretically possible to trigger this bug even on the old logic when
>> always rounding to a full second. However, the window is narrow, and the
>> code has been refactored heavily since then, making a direct backport not
>> apply cleanly.
>>
>> Fixes: d755a7e129a5 ("ice: Cache perout/extts requests and check flags")
>> Signed-off-by: Karol Kolacinski <karol.kolacinski@...el.com>
>> Signed-off-by: Jacob Keller <jacob.e.keller@...el.com>
>
> Thanks for the excellent patch description.
>
> Reviewed-by: Simon Horman <horms@...nel.org>
>
> ...
Lol.. can't get anything right this week.. Just noticed I had already
sent this but forgot CC to Intel Wired LAN, so it wasn't showing up in
our patchwork.
Powered by blists - more mailing lists