linux-kernel - Re: [PATCH net-next v1] net: microchip: lan743x: Reduce PTP timeout on HW failure

Message-ID: <5ee0e9beb684dcf0b19b5c0698deea033cfff588.camel@microchip.com>
Date: Wed, 8 May 2024 08:52:30 +0000
From: <Rengarajan.S@...rochip.com>
To: <andrew@...n.ch>
CC: <Bryan.Whitehead@...rochip.com>, <davem@...emloft.net>,
	<linux-kernel@...r.kernel.org>, <netdev@...r.kernel.org>,
	<pabeni@...hat.com>, <richardcochran@...il.com>, <edumazet@...gle.com>,
	<UNGLinuxDriver@...rochip.com>, <kuba@...nel.org>
Subject: Re: [PATCH net-next v1] net: microchip: lan743x: Reduce PTP timeout
 on HW failure

On Tue, 2024-05-07 at 03:33 +0200, Andrew Lunn wrote:
> On Thu, May 02, 2024 at 10:33:00AM +0530, Rengarajan S wrote:
> > The PTP_CMD_CTL is a self clearing register which controls the PTP
> > clock values. In the current implementation driver waits for a
> > duration of 20 sec in case of HW failure to clear the PTP_CMD_CTL
> > register bit. This timeout of 20 sec is very long to recognize a HW
> > failure, as it is typically cleared in one clock (<16ns). Hence
> > reducing the timeout to 1 sec would be sufficient to conclude if
> > there is any HW failure observed. The usleep_range will sleep
> > somewhere between 1 msec to 20 msec for each iteration. By setting
> > the PTP_CMD_CTL_TIMEOUT_CNT to 50 the max timeout is extended to
> > 1 sec.
> 
> This patch has already been merged, so this is just for my
> curiosity. The hardware is dead. Does it really matter if we wait 1s
> or 20 seconds. It is still dead? This is a void function. Other than
> reporting that the hardware is dead, nothing is done. So this change
> seems pointless?
> 
>         Andrew

Hi Andrew,

Based on customer feedback, there may be cases where the 20-sec delay
itself causes the issue (the HW being reported as dead). On boards with
defects or failures, resetting the chip was found on a few occasions to
resolve the problem. However, with the old timeout we had to wait 20 sec
before the chip could be reset, so we found that reducing the timeout to
1 sec would be optimal.
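
For reference, the timeout arithmetic from the commit message can be
sketched in plain userspace C. This is not the lan743x driver code: the
fake_ptp_hw struct, the helper names, and the POLL_* macros are
hypothetical stand-ins, and the sleep is only accounted for rather than
performed. It assumes every iteration sleeps for the full 20 msec upper
bound, which gives the worst case of 50 * 20 msec = 1 sec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values mirroring the numbers quoted in the thread. */
#define POLL_SLEEP_MIN_US 1000u   /* 1 msec lower bound per iteration */
#define POLL_SLEEP_MAX_US 20000u  /* 20 msec upper bound per iteration */
#define POLL_TIMEOUT_CNT  50u     /* iterations before giving up */

/* Stand-in for a self-clearing command register (not real hardware). */
struct fake_ptp_hw {
	uint32_t cmd_ctl;       /* pending command bits */
	unsigned clears_after;  /* reads until the bits clear; 0 = never (dead HW) */
	unsigned reads;         /* number of register reads so far */
};

static uint32_t fake_read_cmd_ctl(struct fake_ptp_hw *hw)
{
	hw->reads++;
	if (hw->clears_after != 0 && hw->reads >= hw->clears_after)
		hw->cmd_ctl = 0;
	return hw->cmd_ctl;
}

/* Worst-case total wait if every iteration sleeps the maximum time. */
static uint64_t worst_case_wait_us(unsigned max_polls)
{
	return (uint64_t)max_polls * POLL_SLEEP_MAX_US;
}

/*
 * Poll until the given bit clears or the iteration budget runs out.
 * Returns true if the bit cleared. *slept_us receives the worst-case
 * time that would have been spent sleeping.
 */
static bool wait_for_cmd_clear(struct fake_ptp_hw *hw, uint32_t bit,
			       unsigned max_polls, uint64_t *slept_us)
{
	assert(max_polls > 0);
	*slept_us = 0;
	for (unsigned i = 0; i < max_polls; i++) {
		if ((fake_read_cmd_ctl(hw) & bit) == 0)
			return true;
		/* A kernel driver would call usleep_range() here. */
		*slept_us += POLL_SLEEP_MAX_US;
	}
	return false;
}
```

With these assumed values, a dead device is detected after at most
1,000,000 usec, while a healthy one returns on the first read without
sleeping. By the same arithmetic, a 20-sec worst case corresponds to
1000 iterations at 20 msec each.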