[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <dc4194a7-c4aa-4132-9ca3-1121530ba1bc@molgen.mpg.de>
Date: Thu, 7 Nov 2024 06:56:19 +0100
From: Paul Menzel <pmenzel@...gen.mpg.de>
To: Christopher S Hall <christopher.s.hall@...el.com>
Cc: intel-wired-lan@...ts.osuosl.org, David Zage <david.zage@...el.com>,
Vinicius Gomes <vinicius.gomes@...el.com>, netdev@...r.kernel.org,
"Cadore Cataldo, Rodrigo" <rodrigo.cadore@...coustics.com>,
Corinna Vinschen <vinschen@...hat.com>,
Michal Swiatkowski <michal.swiatkowski@...ux.intel.com>,
Mor Bar Gabay <morx.bar.gabay@...el.com>,
Avigail Dahan <avigailx.dahan@...el.com>
Subject: Re: [Intel-wired-lan] [PATCH iwl-net v3 2/6] igc: Lengthen the
hardware retry time to prevent timeouts
[Cc: -Sasha, 550 #5.1.0 Address rejected.]
Am 07.11.24 um 06:48 schrieb Paul Menzel:
> [Cc: +Sasha]
>
> Dear Christopher,
>
>
> Am 07.11.24 um 00:53 schrieb Hall, Christopher S:
>
>>> From: Paul Menzel <pmenzel@...gen.mpg.de>
>>> Sent: Wednesday, November 06, 2024 3:14 PM
>>
>>> Subject: Re: [Intel-wired-lan] [PATCH iwl-net v3 2/6] igc: Lengthen the hardware retry time to prevent timeouts
>
>>> I’d use the more specific summary/title below:
>>
>> Will do.
>>
>>> igc: Lengthen hardware retry time to 4 μs to prevent timeouts
>>>
>>> Am 06.11.24 um 19:47 schrieb Christopher S M Hall:
>>>> Lengthen the hardware retry timer to four microseconds.
>>>>
>>>> The i225/i226 hardware retries if it receives an inappropriate response
>>>> from the upstream device. If the device retries too quickly, the root
>>>> port does not respond.
>>>
>>> Any idea why? Is it documented somewhere?
>>
>> I do not. Theoretically, 1 us should work, but it does not. It could be a root
>> port problem or an issue with i225/i226 NIC. I am not able to directly observe
>> the state of either. 4 us has worked in all my testing I am comfortable with
>> that value. 2 us also works, but given the limited hardware at my disposal
>> I doubled the value to 4 us to be safe. PTM is not time critical. Typically,
>> software initiates a transaction between 8 and 32 times per second. There
>> is no performance impact for PTM or any other function of the card. The
>> timeout occurs rarely, but if the retry time is too short the PTM state
>> machine does not recover.
>
> Thank you for clearing this up. If it’s not time critical, why not
> revert the original patch and go back to 10 μs.
>
> The referenced commit 6b8aa753a9f9 (igc: Decrease PTM short interval
> from 10 us to 1 us) also says, that 1 μs was suggested by the hardware
> team. Were you able to talk to them?
>
>>>> The issue can be reproduced with the following:
>>>>
>>>> $ sudo phc2sys -R 1000 -O 0 -i tsn0 -m
>>>>
>>>> Note: 1000 Hz (-R 1000) is unrealistically large, but provides a way to
>>>> quickly reproduce the issue.
>>>>
>>>> PHC2SYS exits with:
>>>>
>>>> "ioctl PTP_OFFSET_PRECISE: Connection timed out" when the PTM transaction
>>>> fails
>>>
>>> Why four microseconds, and not some other value?
>>
>> See above.
>
> It’d be great, if you extended the commit message.
>
>>>> Fixes: 6b8aa753a9f9 ("igc: Decrease PTM short interval from 10 us to
>>>> 1 us")
>>>>
>>>> -#define IGC_PTM_SHORT_CYC_DEFAULT 1 /* Default short cycle interval */
>>>> +#define IGC_PTM_SHORT_CYC_DEFAULT 4 /* Default short cycle interval */
>
> Maybe also add a comment, that 1 μs should work, but does not.
>
>
> Kind regards,
>
> Paul
Powered by blists - more mailing lists