netdev - Re: [Intel-wired-lan] [PATCH iwl-net v3 2/6] igc: Lengthen the hardware retry time to prevent timeouts

Message-ID: <dc4194a7-c4aa-4132-9ca3-1121530ba1bc@molgen.mpg.de>
Date: Thu, 7 Nov 2024 06:56:19 +0100
From: Paul Menzel <pmenzel@...gen.mpg.de>
To: Christopher S Hall <christopher.s.hall@...el.com>
Cc: intel-wired-lan@...ts.osuosl.org, David Zage <david.zage@...el.com>,
 Vinicius Gomes <vinicius.gomes@...el.com>, netdev@...r.kernel.org,
 "Cadore Cataldo, Rodrigo" <rodrigo.cadore@...coustics.com>,
 Corinna Vinschen <vinschen@...hat.com>,
 Michal Swiatkowski <michal.swiatkowski@...ux.intel.com>,
 Mor Bar Gabay <morx.bar.gabay@...el.com>,
 Avigail Dahan <avigailx.dahan@...el.com>
Subject: Re: [Intel-wired-lan] [PATCH iwl-net v3 2/6] igc: Lengthen the
 hardware retry time to prevent timeouts

[Cc: -Sasha, 550 #5.1.0 Address rejected.]

On 07.11.24 at 06:48, Paul Menzel wrote:
> [Cc: +Sasha]
> 
> Dear Christopher,
> 
> 
> On 07.11.24 at 00:53, Hall, Christopher S wrote:
> 
>>> From: Paul Menzel <pmenzel@...gen.mpg.de>
>>> Sent: Wednesday, November 06, 2024 3:14 PM
>>
>>> Subject: Re: [Intel-wired-lan] [PATCH iwl-net v3 2/6] igc: Lengthen the hardware retry time to prevent timeouts
> 
>>> I’d use the more specific summary/title below:
>>
>> Will do.
>>
>>> igc: Lengthen hardware retry time to 4 μs to prevent timeouts
>>>
>>> On 06.11.24 at 19:47, Christopher S M Hall wrote:
>>>> Lengthen the hardware retry timer to four microseconds.
>>>>
>>>> The i225/i226 hardware retries if it receives an inappropriate response
>>>> from the upstream device. If the device retries too quickly, the root
>>>> port does not respond.
>>>
>>> Any idea why? Is it documented somewhere?
>>
>> I do not. Theoretically, 1 us should work, but it does not. It could be a root
>> port problem or an issue with i225/i226 NIC. I am not able to directly observe
>> the state of either. 4 us has worked in all my testing, so I am comfortable with
>> that value. 2 us also works, but given the limited hardware at my disposal
>> I doubled the value to 4 us to be safe. PTM is not time critical. Typically,
>> software initiates a transaction between 8 and 32 times per second. There
>> is no performance impact for PTM or any other function of the card. The
>> timeout occurs rarely, but if the retry time is too short the PTM state
>> machine does not recover.
> 
> Thank you for clearing this up. If it’s not time critical, why not
> revert the original patch and go back to 10 μs?
> 
> The referenced commit 6b8aa753a9f9 (igc: Decrease PTM short interval
> from 10 us to 1 us) also says that 1 μs was suggested by the hardware
> team. Were you able to talk to them?
> 
>>>> The issue can be reproduced with the following:
>>>>
>>>> $ sudo phc2sys -R 1000 -O 0 -i tsn0 -m
>>>>
>>>> Note: 1000 Hz (-R 1000) is unrealistically large, but provides a way to
>>>> quickly reproduce the issue.
>>>>
>>>> PHC2SYS exits with:
>>>>
>>>> "ioctl PTP_OFFSET_PRECISE: Connection timed out" when the PTM transaction
>>>>     fails
>>>
>>> Why four microseconds, and not some other value?
>>
>> See above.
> 
> It’d be great if you extended the commit message with that explanation.
> 
>>>> Fixes: 6b8aa753a9f9 ("igc: Decrease PTM short interval from 10 us to 1 us")
>>>>
>>>> -#define IGC_PTM_SHORT_CYC_DEFAULT    1   /* Default short cycle interval */
>>>> +#define IGC_PTM_SHORT_CYC_DEFAULT    4   /* Default short cycle interval */
> 
> Maybe also add a comment that 1 μs should work in theory, but does not in practice.
> 
> 
> Kind regards,
> 
> Paul
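
For illustration, a minimal sketch of how the define might read with such a comment. Only the macro name and the value 4 come from the patch; the comment wording is an assumption based on the explanation quoted above, not text from the actual driver:

```c
/*
 * Default short cycle (hardware retry) interval, in microseconds.
 *
 * Assumed wording: 1 us should work in theory, but in testing the PTM
 * transaction occasionally timed out and the PTM state machine did not
 * recover. 2 us worked; the value was doubled to 4 us as a safety margin.
 */
#define IGC_PTM_SHORT_CYC_DEFAULT	4
```

A comment like this would keep the rationale next to the value, so a later reader does not have to dig through the list archive to learn why 1 us was abandoned.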