[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <IA1PR11MB653723B4B25B0EFDA02C6D8A9A11A@IA1PR11MB6537.namprd11.prod.outlook.com>
Date: Fri, 19 Sep 2025 12:57:18 +0000
From: "Shalev, Avi" <avi.shalev@...el.com>
To: Vadim Fedorenko <vadim.fedorenko@...ux.dev>, "Choong, Chwee Lin"
<chwee.lin.choong@...el.com>, "Keller, Jacob E" <jacob.e.keller@...el.com>,
"Nguyen, Anthony L" <anthony.l.nguyen@...el.com>, "Kitszel, Przemyslaw"
<przemyslaw.kitszel@...el.com>, Andrew Lunn <andrew+netdev@...n.ch>, "David S
. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, "Jakub
Kicinski" <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, Richard Cochran
<richardcochran@...il.com>, "Gomes, Vinicius" <vinicius.gomes@...el.com>
CC: "intel-wired-lan@...ts.osuosl.org" <intel-wired-lan@...ts.osuosl.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "Song, Yoong
Siang" <yoong.siang.song@...el.com>
Subject: RE: [Intel-wired-lan] [PATCH iwl-net v1] igc: fix race condition in
TX timestamp read for register 0
Hello
Please see my comment at the end of this thread.
Avi Shalev
i226 design team
-----Original Message-----
From: Vadim Fedorenko <vadim.fedorenko@...ux.dev>
Sent: Friday, September 19, 2025 1:56 PM
To: Choong, Chwee Lin <chwee.lin.choong@...el.com>; Keller, Jacob E <jacob.e.keller@...el.com>; Nguyen, Anthony L <anthony.l.nguyen@...el.com>; Kitszel, Przemyslaw <przemyslaw.kitszel@...el.com>; Andrew Lunn <andrew+netdev@...n.ch>; David S . Miller <davem@...emloft.net>; Eric Dumazet <edumazet@...gle.com>; Jakub Kicinski <kuba@...nel.org>; Paolo Abeni <pabeni@...hat.com>; Richard Cochran <richardcochran@...il.com>; Gomes, Vinicius <vinicius.gomes@...el.com>
Cc: intel-wired-lan@...ts.osuosl.org; netdev@...r.kernel.org; linux-kernel@...r.kernel.org; Shalev, Avi <avi.shalev@...el.com>; Song, Yoong Siang <yoong.siang.song@...el.com>
Subject: Re: [Intel-wired-lan] [PATCH iwl-net v1] igc: fix race condition in TX timestamp read for register 0
On 19/09/2025 08:17, Choong, Chwee Lin wrote:
>
> On Friday, September 19, 2025 6:11 AM, Keller, Jacob E <jacob.e.keller@...el.com> wrote:
>> On 9/18/2025 1:47 PM, Vadim Fedorenko wrote:
>>> On 18/09/2025 19:38, Chwee-Lin Choong wrote:
>>>> The current HW bug workaround checks the TXTT_0 ready bit first,
>>>> then reads LOW -> HIGH -> LOW from register 0 to detect if a
>>>> timestamp was captured.
>>>>
>>>> This sequence has a race: if a new timestamp is latched after
>>>> reading the TXTT mask but before the first LOW read, both old and
>>>> new timestamp match, causing the driver to drop a valid timestamp.
>>>>
>>>> Fix by reading the LOW register first, then the TXTT mask, so a
>>>> newly latched timestamp will always be detected.
>>>>
>>>> This fix also prevents TX unit hangs observed under heavy
>>>> timestamping load.
>>>>
>>>> Fixes: c789ad7cbebc ("igc: Work around HW bug causing missing
>>>> timestamps")
>>>> Suggested-by: Avi Shalev <avi.shalev@...el.com>
>>>> Signed-off-by: Song Yoong Siang <yoong.siang.song@...el.com>
>>>> Signed-off-by: Chwee-Lin Choong <chwee.lin.choong@...el.com>
>>>> ---
>>>> drivers/net/ethernet/intel/igc/igc_ptp.c | 10 ++++++++--
>>>> 1 file changed, 8 insertions(+), 2 deletions(-)
>>>>
>>>
>>> [...]
>>>
>>>> * timestamp was captured, we can read the "high"
>>>> * register again.
>>>> */
>>>
>>> This comment begins with 'read the "high" register (to latch a new
>>> timestamp)' ...
>>>
>>>> - u32 txstmpl_old, txstmpl_new;
>>>> + u32 txstmpl_new;
>>>>
>>>> - txstmpl_old = rd32(IGC_TXSTMPL);
>>>> rd32(IGC_TXSTMPH);
>>>> txstmpl_new = rd32(IGC_TXSTMPL);
>>>
>>> and a couple of lines later in this function you have
>>>
>>> regval = txstmpl_new;
>>> regval |= (u64)rd32(IGC_TXSTMPH) << 32;
>>>
>>> According to the comment above, the value in the register will be
>>> latched after reading IGC_TXSTMPH. As there will be no read of "low"
>>> part of the register, it will stay latched with old value until the
>>> next call to the same function. Could it be the reason of unit hangs?
>>>
>>> It looks like the value of previous read of IGC_TXSTMPH should be
>>> stored and used to construct new timestamp, right?
>>>
>>
>> I wouldn't trust the comment, but instead double check the data sheets.
>> Unfortunately, I don't seem to have a copy of the igc hardware data
>> sheet handy :(
>>
>> Thanks,
>> Jake
>
> Flow before this patch:
> 1. Read the TXTT bits into mask
> 2. if TXTT_0 == 0, go to workaround ->If at this point register 0 captures TX timestamp, and TXTT_0 is set but we think it is 0.
> 3. Read LOW to OLD
> 4. Read HIGH – this clears the TXTT_0
> 5. Read LOW again , now to NEW.
> 6. NEW==OLD, so the timestamp is discarded -> causing timestamp
> timeout
>
> Flow after this patch:
> 1. Read LOW to OLD
> 2. Read the TXTT bits into mask
> 3. if TXTT_0 == 0, go to workaround -> If at this point register 0 captures TX timestamp, and TXTT_0 is set but we think it is 0.
> 4. Read HIGH – this clears the TXTT_0
> 5. Read LOW again , now to NEW.
> 6. NEW!=OLD, so we detect this is a valid timestamp
> 7. Read HIGH again and use the timestamp
>
> Let me know if this address your questions?
> Unfortunately, it doesn't. The question is "what will happen to register after step 7?" The comment above says it will stay latched until LOW is read, will it affect performance/stability?
[Avi] The low and high are latched when a packet (with a time stamp bit set in the descriptor ) was transmitted.
They will not change until next time SW will send a new packet with a time stamp bit set in the descriptor for this timestamp register (1 of 4).
When a timestamp is latched, the associated TXTT_* bit is set and interrupt event bit is set (TSICR.TXTS) .
Reading the high part of a timestamp register clears the associated TXTT_* bit.
The HW bug is that TXSTMPH_0 (specifically) must be read as part of the interrupt routine, otherwise the interrupt event bit (TSICR.TXTS) will not be set again.
The problem is how to know if TXSTMPH/L_0 contain a new valid timestamp or it is an old value that was already read.
If TXTT_0 is 1, TXSTMPH/L_0 contain a new valid timestamp. This part is simple.
If TXTT_0 is 0, and then we read TXSTMPH_0, we can't assume that it is old because it is possible that a new timestamp was latched between reading TXTT_0 and reading TXSTMPH_0.
The "before" code is a workaround for this HW bug. The W/A was not perfect hence the new patch is intended to fix it.
The original W/A was solving the case where a new timestamp was captured between steps 3 and 4.
The problem with the original W/A is in case the new timestamp is captured between steps 2 and 3.
The patch comes to solve this new race condition.
The errata wording and W/A are given below.
The new patch is just asking to change the order and read TXSTMPL_0 before Read TXTT_0-3 (in TSYNCTXCTL), to solve the new corner case.
I226 errata (need to update with this change):
1588 PTP: Transmit timestamp interrupt can be missed
Problem: When transmit timestamp is captured in one of the transmit timestamp registers TXSTMPH/L0-3 (offsets 0xB618 – 0xB6DC), an interrupt is raised (TSICR.TXTS). Once SW reads the updated timestamp according to one (or more) of TXTT_0-3 asserted in TSYNCTXCTL (offset 0xB614, bits 3:0), and clears the interrupt, it is expected that a new timestamp capture will assert the interrupt bit again. However, the new interrupt will assert only if SW reads TXSTMPH_0 specifically, as part of the interrupt handler (even if TXTT_0 is not set).
Implications:
Timestamp interrupts may be missed
Workaround:
A simple W/A to always read TXSTMPH_0 (offset 0xB61C) in the interrupt handler is not good enough due to possible race condition since it also clears TXTT_0, and potentially missing a timestamp that was just captured in TXSTMPH/L_0. This simple W/A can work if TXSTMPH/L_0 is never used, and only TXSTMPH/L_1-3 are used.
If all four timestamp registers are required, SW can detect the race condition mentioned above by keeping a record of TXSTMPL_0 (offset 0xB618) before reading TXSTMPH_0. So the interrupt handler should do the following:
Read TXTT_0-3 (in TSYNCTXCTL) to see which timestamp registers were updated.
Read the timestamp registers according to TXTT_0-3.
If TXTT_0 was not set:
- Read TXSTMPL_0 (sub-second part of timestamp 0), and keep a record of it.
- Read TXSTMPH_0 to W/A this issue-
- Read TXSTMPL_0 again. If it is different from before, it indicates that a new timestamp was just captured in TXSTMPH/L_0. Read TXSTMPH_0 as well and use it.
---------------------------------------------------------------------
Intel Israel (74) Limited
This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
Powered by blists - more mailing lists