Message-ID: <aFD8VDUgRaZ3OZZd@pengutronix.de>
Date: Tue, 17 Jun 2025 07:25:40 +0200
From: Oleksij Rempel <o.rempel@...gutronix.de>
To: Lukasz Majewski <lukma@...x.de>
Cc: netdev@...r.kernel.org, Arun Ramadoss <arun.ramadoss@...rochip.com>,
Vladimir Oltean <olteanv@...il.com>, Tristram.Ha@...rochip.com,
Richard Cochran <richardcochran@...il.com>,
Christian Eggers <ceggers@...i.de>
Subject: Re: [PTP][KSZ9477][p2p1step] Questions for PTP support on KSZ9477 device
On Mon, Jun 16, 2025 at 05:25:01PM +0200, Lukasz Majewski wrote:
> Dear Community,
>
> As of [1] KSZ drivers support HW timestamping HWTSTAMP_TX_ONESTEP_P2P.
> When used with ptp4l (config [2]) I'm able to see that two boards with
> KSZ9477 can communicate and one of them is a grandmaster device.
>
> This is OK (/dev/ptp0 is created and works properly).
>
> From what I have understood - the device which supports p2p1step also
> supports "older" approaches, so communication with other HW shall be
> possible.
This is not fully correct. One-step and two-step timestamping place different
requirements on the hardware and the driver.

In one-step mode, the switch modifies the PTP frame directly and inserts the
timestamp while sending (at start of frame). This works without help from the
host.

In two-step mode, the hardware captures the timestamp only after the frame has
been sent, and the host must then read it. For that, the switch must raise an
interrupt to the host. This requires:
- the board to wire the IRQ line from the switch to the host,
- the driver to handle that interrupt and read the timestamp (as in
  ksz_ptp_msg_thread_fn()).

So it is not only about the switch hardware. It also depends on board design
and driver support.
> Hence the questions:
>
> 1. Would it be possible to communicate with beaglebone black (BBB)
> connected to the same network?
No, this will not work correctly. Both sides must use the same timestamping
mode: either both "one step" or both "two step".
> root@...gleBone:~# ethtool -T eth0
> Hardware Transmit Timestamp Modes:
> off
> on
> Hardware Receive Filter Modes:
> none
> ptpv2-event
BBB supports only "two step" (mode "on").
> My board:
> # ethtool -T lan3
> Hardware Transmit Timestamp Modes:
> off
> onestep-p2p
> Hardware Receive Filter Modes:
> none
> ptpv2-l4-event
> ptpv2-l2-event
> ptpv2-event
Your board supports only "one step". So they cannot sync correctly.
> (other fields are the same)
>
> As I've stated above - onestep-p2p shall also support the "on" mode
> from BBB.
No, onestep-p2p cannot talk to "on" mode. They use different timestamping
logic. ptp4l cannot mix one-step and two-step.
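
The comparison of the two `ethtool -T` listings quoted above can be sketched as
a small script. This is only an illustration under simple assumptions: the
helper names (`tx_modes`, `usable_common_modes`) are made up for this example,
and "the two ends share a TX timestamp mode other than off" is a simplified
reading of the rule above, not a check that ptp4l itself performs.

```python
def tx_modes(ethtool_output: str) -> set:
    """Return the names listed under 'Hardware Transmit Timestamp Modes:'."""
    modes = set()
    in_section = False
    for line in ethtool_output.splitlines():
        text = line.strip()
        if text == "Hardware Transmit Timestamp Modes:":
            in_section = True
        elif text.endswith(":"):
            in_section = False  # any other section header ends the list
        elif in_section and text:
            # Keep only the mode name; a "(HWTSTAMP_TX_...)" suffix is ignored.
            modes.add(text.split()[0])
    return modes


def usable_common_modes(a: str, b: str) -> set:
    """TX timestamp modes offered by both ends, not counting 'off'."""
    return (tx_modes(a) & tx_modes(b)) - {"off"}


# The two listings from this thread (hypothetical variable names).
BBB = """Hardware Transmit Timestamp Modes:
off
on
Hardware Receive Filter Modes:
none
ptpv2-event
"""

KSZ = """Hardware Transmit Timestamp Modes:
off
onestep-p2p
Hardware Receive Filter Modes:
none
ptpv2-l4-event
ptpv2-l2-event
ptpv2-event
"""

print(usable_common_modes(BBB, KSZ))  # prints set(): nothing shared except "off"
```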
> The documentation of KSZ9477 states that:
> - IEEE 1588v2 PTP and Clock Synchronization
> - Transparent Clock (TC) with auto correction update
> - Master and slave Ordinary Clock (OC) support
> - End-to-end (E2E) or peer-to-peer (P2P)
> - PTP multicast and unicast message support
> - PTP message transport over IPv4/v6 and IEEE 802.3
> - IEEE 1588v2 PTP packet filtering
> - Synchronous Ethernet support via recovered clock
>
> which looks like all PTP use cases (and other boards) for HW shall be
> supported.
>
> Is this a matter of not (yet) available in-driver support or do I need
> to configure linuxptp in different way to have such support?
Be careful: this list shows what the hardware can support in principle, not
what actually works in every setup. Many features need specific driver or board
support.

For example, KSZ9477 has an erratum for 2-step mode: when 2-step is enabled,
some PTP messages (such as Sync, Follow-up, Announce) can get dropped if normal
traffic is present. This makes 2-step mode unreliable.

End-user impact: protocols like gPTP or AVB, which require 2-step mode, will
not work correctly. The device cannot keep time sync in 2-step mode with other
devices.
> 2. The master clock synchronization and calibration
>
> On one board (grandmaster) (connected to lan3):
> [1943.558]: port 0: INITIALIZING to LISTENING on INIT_COMPLETE
> [1951.091]: port 1: LISTENING to MASTER on
> ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES
> [1951.091]: selected local clock 824f12.fffe.110022 as best master
> [1951.091]: port 1: assuming the grand master role
>
> The other board:
> [890.003]: port 1 (lan3): new foreign master 824f12.fffe.110022-1
> [894.003]: selected best master clock 824f12.fffe.110022
> [894.005]: port 1 (lan3): LISTENING to UNCALIBRATED on RS_SLAVE
At this point, I would expect to see output like:

  master offset ... path delay ...
  port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED

If these are missing, sync is not working. In this case, the likely reason is
that the two devices use different timestamping modes (one-step vs. two-step),
which are not compatible. Because of that, delay and offset cannot be
calculated, and the state stays UNCALIBRATED.
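
For reference, the arithmetic behind "delay and offset" can be sketched as
follows. This is the standard IEEE 1588 peer-delay calculation, not code taken
from ptp4l; the function names and the sample numbers are made up for
illustration. The point is only that every input timestamp must be available:
if one is missing, there is nothing to compute.

```python
from typing import Optional


def peer_delay_ns(t1: Optional[int], t2: Optional[int],
                  t3: Optional[int], t4: Optional[int]) -> Optional[float]:
    """Mean link delay from one Pdelay exchange, all values in nanoseconds.

    t1: Pdelay_Req sent by the requester
    t2: Pdelay_Req received by the responder
    t3: Pdelay_Resp sent by the responder
    t4: Pdelay_Resp received by the requester
    """
    if None in (t1, t2, t3, t4):
        return None  # a missing timestamp means no delay estimate
    return ((t4 - t1) - (t3 - t2)) / 2


def master_offset_ns(sync_rx: Optional[int], sync_origin: Optional[int],
                     correction: float,
                     delay: Optional[float]) -> Optional[float]:
    """Offset from master: Sync RX time minus origin time, correction and delay."""
    if sync_rx is None or sync_origin is None or delay is None:
        return None
    return sync_rx - sync_origin - correction - delay


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
delay = peer_delay_ns(1000, 1500, 2500, 3100)
print(delay)                                      # 550.0
print(master_offset_ns(5000, 4000, 0.0, delay))   # 450.0
print(peer_delay_ns(1000, None, 2500, 3100))      # None
```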
> The phc2sys -m -s lan3 shows some calibration...
>
> CLOCK_REALTIME phc offset 65 s2 freq -45509 delay 351557
> CLOCK_REALTIME phc offset 591 s2 freq -44964 delay 350475
> CLOCK_REALTIME phc offset -892 s2 freq -46270 delay 350516
> CLOCK_REALTIME phc offset 137456 s2 freq +91811 delay 733784
> CLOCK_REALTIME phc offset -136987 s2 freq -141395 delay 350676
> CLOCK_REALTIME phc offset -41327 s2 freq -86831 delay 350216
> CLOCK_REALTIME phc offset 66 s2 freq -57837 delay 350489
> CLOCK_REALTIME phc offset 12037 s2 freq -45846 delay 351854
> CLOCK_REALTIME phc offset 12213 s2 freq -42059 delay 350474
> CLOCK_REALTIME phc offset 8984 s2 freq -41624 delay 349682
>
>
> but the "fluctuation" is too large to regard it as a "stable" and
> precise source.
This is not the right tool to check the current sync problem. The timestamp
readings on KSZ are done over several SPI transactions, which adds jitter and
delay.
For better accuracy, the driver should support gettimex64(), but this is not
implemented in the KSZ driver. So the phc2sys output is not reliable here.
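
To illustrate with the numbers quoted above, here is a small script that parses
those phc2sys lines. It is only a sketch: the regular expression is written for
the line format shown in this thread, and the variable names are made up. In
this sample, the largest offset is reported in the same line as the largest
measured delay (about twice the typical value), which is what one would expect
if a slow PHC read disturbs the measurement.

```python
import re
import statistics

# The phc2sys lines quoted in this thread, without the "> " quote prefix.
LOG = """\
CLOCK_REALTIME phc offset 65 s2 freq -45509 delay 351557
CLOCK_REALTIME phc offset 591 s2 freq -44964 delay 350475
CLOCK_REALTIME phc offset -892 s2 freq -46270 delay 350516
CLOCK_REALTIME phc offset 137456 s2 freq +91811 delay 733784
CLOCK_REALTIME phc offset -136987 s2 freq -141395 delay 350676
CLOCK_REALTIME phc offset -41327 s2 freq -86831 delay 350216
CLOCK_REALTIME phc offset 66 s2 freq -57837 delay 350489
CLOCK_REALTIME phc offset 12037 s2 freq -45846 delay 351854
CLOCK_REALTIME phc offset 12213 s2 freq -42059 delay 350474
CLOCK_REALTIME phc offset 8984 s2 freq -41624 delay 349682
"""

PATTERN = re.compile(
    r"phc offset\s+(-?\d+)\s+s\d\s+freq\s+([+-]?\d+)\s+delay\s+(\d+)")

# One (offset, freq, delay) tuple per matching line, all as integers.
samples = [tuple(int(v) for v in m.groups()) for m in PATTERN.finditer(LOG)]
offsets = [s[0] for s in samples]
delays = [s[2] for s in samples]

worst_offset = max(range(len(samples)), key=lambda i: abs(offsets[i]))
worst_delay = max(range(len(samples)), key=lambda i: delays[i])

print("samples:", len(samples))
print("offset range (ns):", min(offsets), "to", max(offsets))
print("median delay (ns):", statistics.median(delays))
print("largest offset and largest delay in the same sample:",
      worst_offset == worst_delay)
```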
> And probably hence it is "UNCALIBRATED" master clock.
>
> Even more strange - the tshark -i lan3 -Y "ptp" -V
>
> .... 1011 = messageId: Announce Message (0xb)
> correction: 0.000000 nanoseconds
> correction: Ns: 0 nanoseconds
> correctionSubNs: 0 nanoseconds
>
> .... 0010 = messageId: Peer_Delay_Req Message (0x2)
> correction: 0.000000 nanoseconds
> correction: Ns: 0 nanoseconds
> correctionSubNs: 0 nanoseconds
>
> shows always the correction value of 0 ns.
> I do guess that it shall have some (different) values.
>
> Any hints on fixing this problem?
The correctionField is only set when Transparent Clock (TC) is active. But with
KSZ switches, TC is disabled as soon as any port uses DSA CPU tagging.
So in your setup, TC is not active, which is why correctionField stays 0. This
is expected behavior with the current driver and DSA integration.
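
As a side note on reading that field: in the PTP common header, correctionField
is a signed 64-bit value at byte offsets 8..15, expressed in nanoseconds
multiplied by 2^16. The sketch below decodes it from raw message bytes. The
function name and the sample header are made up for illustration; it says
nothing about what the KSZ hardware writes there.

```python
def correction_ns(ptp_message: bytes) -> float:
    """Decode correctionField (header bytes 8..15) into nanoseconds."""
    if len(ptp_message) < 16:
        raise ValueError("need at least the first 16 bytes of the PTP header")
    raw = int.from_bytes(ptp_message[8:16], "big", signed=True)
    return raw / 65536  # the field is scaled by 2^16


# Hypothetical 34-byte header: messageType 0xb (Announce), correction 2.5 ns.
header = bytearray(34)
header[0] = 0x0B
header[8:16] = (0x28000).to_bytes(8, "big")

print(correction_ns(bytes(header)))  # 2.5
print(correction_ns(bytes(34)))      # 0.0, as in the tshark output above
```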
> 3. Just to mention - I've found rather old conversation regarding PTP
> support [3] on KSZ devices (but for KSZ9563)
>
> And it looks like it has already been adopted to mainline Linux.
> Am I correct? Or is anything still missing (and hence I do see the two
> described above issues)?
Some support is in mainline, but what works depends on several things:
- the exact KSZ switch variant (some have quirks, e.g. broken 2-step mode),
- the required feature (OC, TC, 1-step or 2-step),
- the board implementation (e.g. IRQ lines connected or not),
- and driver support (e.g. timestamp reading, gettimex64).
For example:
- If your KSZ chip has broken 2-step mode (known issue), you can only use
  1-step.
- If the switch is used with DSA and CPU tagging is enabled, Transparent
  Clock cannot work.
So yes, it’s upstream - but real support depends on your exact use case.
Best Regards,
Oleksij Rempel
--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |