Message-ID: <20250410180205.455d8488@kmaincent-XPS-13-7390>
Date: Thu, 10 Apr 2025 18:02:05 +0200
From: Kory Maincent <kory.maincent@...tlin.com>
To: "Russell King (Oracle)" <linux@...linux.org.uk>
Cc: Simon Horman <horms@...nel.org>, Andrew Lunn <andrew@...n.ch>, Heiner
 Kallweit <hkallweit1@...il.com>, "David S. Miller" <davem@...emloft.net>,
 Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo
 Abeni <pabeni@...hat.com>, Marek Behún <kabel@...nel.org>,
 Richard Cochran <richardcochran@...il.com>, Thomas Petazzoni
 <thomas.petazzoni@...tlin.com>, Maxime Chevallier
 <maxime.chevallier@...tlin.com>, linux-kernel@...r.kernel.org,
 netdev@...r.kernel.org
Subject: Re: [PATCH net-next v2 2/2] net: phy: Add Marvell PHY PTP support

On Thu, 10 Apr 2025 16:41:06 +0100
"Russell King (Oracle)" <linux@...linux.org.uk> wrote:

> On Thu, Apr 10, 2025 at 11:17:54AM +0200, Kory Maincent wrote:
> > On Wed, 9 Apr 2025 23:38:00 +0100
> > "Russell King (Oracle)" <linux@...linux.org.uk> wrote:  
> > > On Wed, Apr 09, 2025 at 06:34:35PM +0100, Russell King (Oracle) wrote:  

> > > 
> > > With that fixed, ptp4l's output looks very similar to that with mvpp2 -
> > > which doesn't inspire much confidence that the ptp stack is operating
> > > properly with the offset and frequency varying all over the place, and
> > > the "delay timeout" messages spamming frequently. I'm also getting
> > > ptp4l going into fault mode - so PHY PTP is proving to be way more
> > > unreliable than mvpp2 PTP. :(  
> > 
> > That's really weird. On my board the Marvell PHY PTP is more reliable than
> > MACB. Even by disabling the interrupt.
> > What is the state of the driver you are using?   
> 
> Right, it seems that some of the problems were using linuxptp v3.0
> rather than v4.4, which seems to work better (in that it doesn't
> seem to time out and drop into fault mode.)
> 
> With v4.4, if I try:
> 
> # ./ptp4l -i eth2 -m -s -2
> ptp4l[322.396]: selected /dev/ptp0 as PTP clock
> ptp4l[322.453]: port 1 (eth2): INITIALIZING to LISTENING on INIT_COMPLETE
> ptp4l[322.454]: port 0 (/var/run/ptp4l): INITIALIZING to LISTENING on INIT_COMPLETE
> ptp4l[322.455]: port 0 (/var/run/ptp4lro): INITIALIZING to LISTENING on INIT_COMPLETE
> ptp4l[328.797]: selected local clock 005182.fffe.113302 as best master
> 
> that's all I see. If I drop the -2, then:

It seems you are still using your Marvell PHY driver without my changes.
PTP over L2 was broken in your first patch, and I fixed it.
I get the same result without the -2, which means ptp4l uses UDP over IPv4.
 
> and from that you can see that the offset and frequency are very much
> all over the place, not what you would expect from something that is
> supposed to be _hardware_ timestamped - which is why I say that NTP
> seems to be superior to PTP at least here.
> 
> With mvpp2, it's a very similar story:

> ptp4l[628.834]: master offset      38211 s2 freq  -29874 path delay     62949
> ptp4l[629.834]: master offset     -41111 s2 freq  -97733 path delay     66289
> ptp4l[630.834]: master offset      33131 s2 freq  -35824 path delay     63864
> ptp4l[631.834]: master offset     -55578 s2 freq -114594 path delay     63864
> ptp4l[632.833]: master offset      34110 s2 freq  -41579 path delay     57582
> ptp4l[633.834]: master offset     -13137 s2 freq  -78593 path delay     60047
> ptp4l[634.834]: master offset      55063 s2 freq  -14334 path delay     49425
> ptp4l[635.834]: master offset     -41302 s2 freq  -94180 path delay     49425

I can't comment on mvpp2, as I don't have a board with this MAC, but these
values seem really high and vary a lot. Since the behavior is similar with both
the Marvell PHY and the mvpp2 MAC, the issue may indeed come from your link
partner.

> Again, offset all over the place, frequency also showing that it doesn't
> stabilise.
> 
> This _could_ be because of the master clock being random - but then it's
> using the FEC PTP implementation with PTPD v2 - maybe either the FEC
> implementation is buggy or maybe it's PTPD v2 causing this. I have no
> idea how I can debug this - and I'm not going to invest in a "grand
> master" PTP clock on a whim just to find out that isn't the problem.
> 
> I thought... maybe I can use the PTP implementation in a Marvell
> switch as the network master, but the 88E6176 doesn't support PTP.
> 
> Maybe I can use an x86 platform... nope:
> 
> # ethtool -T enp0s25
> Time stamping parameters for enp0s25:
> Capabilities:
>         software-transmit
>         software-receive
>         software-system-clock

Still, you could try software timestamping on the link partner.
On my side, I am using an STM32MP157-DK as the link partner.

If I set the DK board as PTP master and tell it to use software timestamping
(the -S parameter), it is still more reliable than yours:
ptp4l[4419.134]: master offset        136 s2 freq   -1984 path delay    118390
ptp4l[4420.134]: master offset       1757 s2 freq    -322 path delay    115888
ptp4l[4421.134]: master offset      -1154 s2 freq   -2706 path delay    115888
ptp4l[4422.134]: master offset      -1652 s2 freq   -3551 path delay    115888
ptp4l[4423.134]: master offset      -1199 s2 freq   -3593 path delay    115252
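
To put a number on "more reliable", here is a minimal sketch that parses the
ptp4l servo lines and compares the spread of the master offset between the two
logs quoted in this mail. The helper name offset_stats and the regular
expression are my own assumptions for illustration; they are not part of
linuxptp.

```python
import re
import statistics

# Matches ptp4l servo lines such as:
# "ptp4l[628.834]: master offset      38211 s2 freq  -29874 path delay     62949"
LINE_RE = re.compile(
    r"master offset\s+(-?\d+)\s+s\d\s+freq\s+([+-]?\d+)\s+path delay\s+(-?\d+)"
)

def offset_stats(log_text):
    """Return (peak_to_peak_ns, stdev_ns) of the master offset in a ptp4l log."""
    offsets = [int(m.group(1)) for m in LINE_RE.finditer(log_text)]
    if not offsets:
        raise ValueError("no 'master offset' lines found")
    return max(offsets) - min(offsets), statistics.pstdev(offsets)

# Log taken with mvpp2 against the FEC / PTPD v2 master (quoted above).
MVPP2_LOG = """\
ptp4l[628.834]: master offset      38211 s2 freq  -29874 path delay     62949
ptp4l[629.834]: master offset     -41111 s2 freq  -97733 path delay     66289
ptp4l[630.834]: master offset      33131 s2 freq  -35824 path delay     63864
ptp4l[631.834]: master offset     -55578 s2 freq -114594 path delay     63864
ptp4l[632.833]: master offset      34110 s2 freq  -41579 path delay     57582
ptp4l[633.834]: master offset     -13137 s2 freq  -78593 path delay     60047
ptp4l[634.834]: master offset      55063 s2 freq  -14334 path delay     49425
ptp4l[635.834]: master offset     -41302 s2 freq  -94180 path delay     49425
"""

# Log taken against the STM32MP157-DK master using software timestamping.
DK_LOG = """\
ptp4l[4419.134]: master offset        136 s2 freq   -1984 path delay    118390
ptp4l[4420.134]: master offset       1757 s2 freq    -322 path delay    115888
ptp4l[4421.134]: master offset      -1154 s2 freq   -2706 path delay    115888
ptp4l[4422.134]: master offset      -1652 s2 freq   -3551 path delay    115888
ptp4l[4423.134]: master offset      -1199 s2 freq   -3593 path delay    115252
"""

if __name__ == "__main__":
    for name, log in (("mvpp2 / FEC master", MVPP2_LOG), ("DK software master", DK_LOG)):
        p2p, sd = offset_stats(log)
        print(f"{name}: peak-to-peak {p2p} ns, stdev {sd:.0f} ns")
```

On these short excerpts the peak-to-peak offset is 110641 ns against the FEC
master and 3409 ns against the DK master. The samples are far too few for real
statistics, but the difference is large enough to be telling.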

> PTP Hardware Clock: none
> Hardware Transmit Timestamp Modes: none
> Hardware Receive Filter Modes: none
> 
> Anyway, let's try taking a tcpdump on the x86 machine of the sync
> packets and compare the deviation of the software timestamp to that
> of the hardware timestamp (all deviations relative to the first
> packet part seconds):
> 
> 16:30:30.577298 - originTimeStamp : 1744299061 seconds, 762464622 nanoseconds
> 16:30:31.577270 - originTimeStamp : 1744299062 seconds, 762363987 nanoseconds
>    -28us						-100.635us
> 16:30:32.577303 - originTimeStamp : 1744299063 seconds, 762429696 nanoseconds
>    +85us						-34.926us
> 16:30:33.577236 - originTimeStamp : 1744299064 seconds, 762328728 nanoseconds
>    -62us						-135.894us
> 16:30:34.577280 - originTimeStamp : 1744299065 seconds, 762398770 nanoseconds
>    -18us						-65.852us
> 
> We can see here that the timestamp from the software receive is far
> more regular than the origin timestamp in the packets, which, in
> combination with the randomness of both mvpp2 and the 88e151x PTP
> trying to sync with it, makes me question whether there is something
> fundamentally wrong with the FEC PTP implementation / PTPDv2.

So we come to the same conclusion: the issue comes from your link partner! ;)
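
For reference, the deviation comparison in the quoted tcpdump excerpt can be
reproduced with a minimal sketch like the one below. It assumes, as the quoted
text does, that sync messages are sent once per second, so only the sub-second
parts need to be compared. The names SYNC_PACKETS, sub_second_us and deviations
are hypothetical helpers, not part of tcpdump or linuxptp.

```python
from datetime import datetime

# (tcpdump software receive time, originTimestamp seconds, originTimestamp ns),
# copied from the quoted capture.
SYNC_PACKETS = [
    ("16:30:30.577298", 1744299061, 762464622),
    ("16:30:31.577270", 1744299062, 762363987),
    ("16:30:32.577303", 1744299063, 762429696),
    ("16:30:33.577236", 1744299064, 762328728),
    ("16:30:34.577280", 1744299065, 762398770),
]

def sub_second_us(hms):
    """Return the microsecond part of an HH:MM:SS.ffffff timestamp."""
    return datetime.strptime(hms, "%H:%M:%S.%f").microsecond

def deviations(packets):
    """Deviation of each packet's sub-second part relative to the first packet.

    Returns a list of (software_deviation_us, origin_deviation_ns) tuples,
    one per packet after the first.
    """
    first_sw_us = sub_second_us(packets[0][0])
    first_origin_ns = packets[0][2]
    return [
        (sub_second_us(hms) - first_sw_us, origin_ns - first_origin_ns)
        for hms, _sec, origin_ns in packets[1:]
    ]

if __name__ == "__main__":
    for sw_us, origin_ns in deviations(SYNC_PACKETS):
        print(f"software {sw_us:+d} us, origin {origin_ns / 1000:+.3f} us")
```

The origin timestamp deviations match the quoted figures. For the third packet
the arithmetic on the quoted receive times gives +5 us rather than the +85us
written in the quote, so one of those two values may be a typo. Either way the
software receive timestamps stay within a few tens of microseconds while the
origin timestamps wander by more than 100 us.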

Regards,
-- 
Köry Maincent, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com
