[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-Id: <20200615.134640.514632878913727284.davem@davemloft.net>
Date: Mon, 15 Jun 2020 13:46:40 -0700 (PDT)
From: David Miller <davem@...emloft.net>
To: olteanv@...il.com
Cc: f.fainelli@...il.com, vivien.didelot@...il.com, andrew@...n.ch,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
richardcochran@...il.com, vinicius.gomes@...el.com
Subject: Re: [PATCH net] net: dsa: sja1105: fix PTP timestamping with large
tc-taprio cycles
From: Vladimir Oltean <olteanv@...il.com>
Date: Sun, 14 Jun 2020 23:54:09 +0300
> From: Vladimir Oltean <vladimir.oltean@....com>
>
> It isn't actually described clearly at all in UM10944.pdf, but on TX of
> a management frame (such as PTP), this needs to happen:
>
> - The destination MAC address (i.e. 01-80-c2-00-00-0e), along with the
> desired destination port, need to be installed in one of the 4
> management slots of the switch, over SPI.
> - The host can poll over SPI for that management slot's ENFPORT field.
> That gets unset when the switch has matched the slot to the frame.
>
> And therein lies the problem. ENFPORT does not mean that the packet has
> been transmitted. Just that it has been received over the CPU port, and
> that the mgmt slot is yet again available.
>
> This is relevant because of what we are doing in sja1105_ptp_txtstamp_skb,
> which is called right after sja1105_mgmt_xmit. We are in a hard
> real-time deadline, since the hardware only gives us 24 bits of TX
> timestamp, so we need to read the full PTP clock to reconstruct it.
> Because we're in a hurry (in an attempt to make sure that we have a full
> 64-bit PTP time which is as close as possible to the actual transmission
> time of the frame, to avoid 24-bit wraparounds), first we read the PTP
> clock, then we poll for the TX timestamp to become available.
>
> But of course, we don't know for sure that the frame has been
> transmitted when we read the full PTP clock. We had assumed that ENFPORT
> means it has, but the assumption is incorrect. And while in most
> real-life scenarios this has never been caught due to software delays,
> nowhere is this fact more obvious than with a tc-taprio offload, where
> PTP traffic gets a small timeslot very rarely (example: 1 packet per 10
> ms). In that case, we will be reading the PTP clock for timestamp
> reconstruction too early (before the packet has been transmitted), and
> this renders the reconstruction procedure incorrect (see the assumptions
> described in the comments found on function sja1105_tstamp_reconstruct).
> So the PTP TX timestamps will be off by 1<<24 clock ticks, or 135 ms
> (1 tick is 8 ns).
>
> So fix this case of premature optimization by simply reordering the
> sja1105_ptpegr_ts_poll and the sja1105_ptpclkval_read function calls. It
> turns out that in practice, the 135 ms hard deadline for PTP timestamp
> wraparound is not so hard, since even the most bandwidth-intensive PTP
> profiles, such as 802.1AS-2011, have a sync frame interval of 125 ms.
> So if we couldn't deliver a timestamp in 135 ms (which we can), we're
> toast and have much bigger problems anyway.
>
> Fixes: 47ed985e97f5 ("net: dsa: sja1105: Add logic for TX timestamping")
> Signed-off-by: Vladimir Oltean <vladimir.oltean@....com>
Applied and queued up for -stable, thank you.
Powered by blists - more mailing lists