[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <67abfa65cce76_155892294dc@willemb.c.googlers.com.notmuch>
Date: Tue, 11 Feb 2025 20:33:25 -0500
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Luiz Augusto von Dentz <luiz.dentz@...il.com>,
Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: Pauli Virtanen <pav@....fi>,
linux-bluetooth@...r.kernel.org,
netdev@...r.kernel.org,
davem@...emloft.net,
kuba@...nel.org
Subject: Re: [PATCH v3 1/5] net-timestamp: COMPLETION timestamp on packet tx
completion
Luiz Augusto von Dentz wrote:
> Hi Willem,
>
> On Mon, Feb 10, 2025 at 1:43 PM Willem de Bruijn
> <willemdebruijn.kernel@...il.com> wrote:
> >
> > Pauli Virtanen wrote:
> > > Hi,
> > >
> > > su, 2025-02-09 kello 22:29 -0500, Willem de Bruijn kirjoitti:
> > > > Pauli Virtanen wrote:
> > > > > Add SOF_TIMESTAMPING_TX_COMPLETION, for requesting a software timestamp
> > > > > when hardware reports a packet completed.
> > > > >
> > > > > Completion tstamp is useful for Bluetooth, where hardware tx timestamps
> > > > > cannot be obtained except for ISO packets, and the hardware has a queue
> > > > > where packets may wait. In this case the software SND timestamp only
> > > > > reflects the kernel-side part of the total latency (usually small) and
> > > > > queue length (usually 0 unless HW buffers congested), whereas the
> > > > > completion report time is more informative of the true latency.
> > > > >
> > > > > It may also be useful in other cases where HW TX timestamps cannot be
> > > > > obtained and user wants to estimate an upper bound to when the TX
> > > > > probably happened.
> > > >
> > > > Getting the completion timestamp may indeed be useful more broadly.
> > > >
> > > > Alternatively, the HW timestamp is relatively imprecisely defined so
> > > > you could even just use that. Ideally, a hw timestamp conforms to IEEE
> > > > 1588v2 PHY: first symbol on the wire IIRC. But in many cases this is
> > > > not the case. It is not feasible at line rate, or the timestamp is
> > > > only taken when the completion is written over PCI, which may be
> > > > subject to PCI backpressure and happen after transmission on the wire.
> > > > As a result, the worst case hw tstamp must already be assumed not much
> > > > earlier than a completion timestamp.
> > >
> > > For BT ISO packets, in theory hw-provided TX timestamps exist, and we
> > > might want both (with separate flags for enabling them). I don't really
> > > know, last I looked Intel HW didn't support them, and it's not clear to
> > > which degree they are useful.
> >
> > That's reason enough to separate these measurement types.
> >
> > If we don't do it properly now, we won't be able to update drivers
> > later once users depend on requesting hw timestamps when they mean to
> > get completion timestamps.
> >
> > > > That said, +1 on adding explicit well defined measurement point
> > > > instead.
> > > >
> > > > > Signed-off-by: Pauli Virtanen <pav@....fi>
> > > > > ---
> > > > > Documentation/networking/timestamping.rst | 9 +++++++++
> > > > > include/linux/skbuff.h | 6 +++++-
> > > > > include/uapi/linux/errqueue.h | 1 +
> > > > > include/uapi/linux/net_tstamp.h | 6 ++++--
> > > > > net/ethtool/common.c | 1 +
> > > > > net/socket.c | 3 +++
> > > > > 6 files changed, 23 insertions(+), 3 deletions(-)
> > > > >
> > > > > diff --git a/Documentation/networking/timestamping.rst b/Documentation/networking/timestamping.rst
> > > > > index 61ef9da10e28..de2afed7a516 100644
> > > > > --- a/Documentation/networking/timestamping.rst
> > > > > +++ b/Documentation/networking/timestamping.rst
> > > > > @@ -140,6 +140,15 @@ SOF_TIMESTAMPING_TX_ACK:
> > > > > cumulative acknowledgment. The mechanism ignores SACK and FACK.
> > > > > This flag can be enabled via both socket options and control messages.
> > > > >
> > > > > +SOF_TIMESTAMPING_TX_COMPLETION:
> > > > > + Request tx timestamps on packet tx completion. The completion
> > > > > + timestamp is generated by the kernel when it receives packet a
> > > > > + completion report from the hardware. Hardware may report multiple
> > > > > + packets at once, and completion timestamps reflect the timing of the
> > > > > + report and not actual tx time. The completion timestamps are
> > > > > + currently implemented only for: Bluetooth L2CAP and ISO. This
> > > > > + flag can be enabled via both socket options and control messages.
> > > > > +
> > > >
> > > > Either we should support this uniformly, or it should be possible to
> > > > query whether a driver supports this.
> > > >
> > > > Unfortunately all completion callbacks are driver specific.
> > > >
> > > > But drivers that support hwtstamps will call skb_tstamp_tx with
> > > > nonzero hwtstamps. We could use that also to compute and queue
> > > > a completion timestamp if requested. At least for existing NIC
> > > > drivers.
> > >
> > > Ok. If possible, I'd like to avoid changing the behavior of the non-
> > > Bluetooth parts of net/ here, as I'm not familiar with those.
> > >
> > > I guess a simpler solution could be that sock_set_timestamping() checks
> > > the type of the socket, and gives EINVAL if the flag is set for non-
> > > Bluetooth sockets?
> >
> > Actually, I'd prefer to have this completion timestamp ability for all
> > drivers. And avoid creating subsystem private mechanisms.
> >
> > I suppose we can punt on the get_ts_info control API if need be.
>
> I guess that it is reasonable if we don't have to do the work for
> drivers other than Bluetooth otherwise I'd say you are probably asking
> too much here,
I was mainly agreeing to Pauli's implementation in this series.
Not asking to do any work irrelevant to BT.
> also doesn't this land on the TSN space if one needs to
> tightly control timings? I suspect if this sort of change was not
> necessary for TSN then perhaps it wouldn't be of much value to try to
> generalize this.
I don't fully follow. But in general SO_TIMESTAMPING is not limited to
TSN.
>
> > > One could then postpone having to invent how to check the driver
> > > support, and user would know non-supported status from setsockopt
> > > failing.
> > >
> > > > > 1.3.2 Timestamp Reporting
> > > > > ^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> > > > > index bb2b751d274a..3707c9075ae9 100644
> > > > > --- a/include/linux/skbuff.h
> > > > > +++ b/include/linux/skbuff.h
> > > > > @@ -489,10 +489,14 @@ enum {
> > > > >
> > > > > /* generate software time stamp when entering packet scheduling */
> > > > > SKBTX_SCHED_TSTAMP = 1 << 6,
> > > > > +
> > > > > + /* generate software time stamp on packet tx completion */
> > > > > + SKBTX_COMPLETION_TSTAMP = 1 << 7,
> > > > > };
> > > > >
> > > > > #define SKBTX_ANY_SW_TSTAMP (SKBTX_SW_TSTAMP | \
> > > > > - SKBTX_SCHED_TSTAMP)
> > > > > + SKBTX_SCHED_TSTAMP | \
> > > > > + SKBTX_COMPLETION_TSTAMP)
> > > >
> > > > These fields are used in the skb_shared_info tx_flags field.
> > > > Which is a very scarce resource. This takes the last available bit.
> > > > That is my only possible concern: the opportunity cost.
> > >
> > > If doing it per-protocol sounds ok, it could be put in bt_skb_cb
> > > instead.
> > >
> > > Since the completion timestamp didn't already exist, it maybe means
> > > it's probably not that important for other parts of net/
> >
> > I can see its value especially for hardware that does not support
> > hardware timestamps, or hw timestamps at line rate.
> >
> > This gives a reasonable estimation of transmission time and
> > measure of device delay.
> >
> > It is device specific whether it will be an over- or under-estimation,
> > depending on whether the completion is queued to the host after or
> > before the data is written on the wire. But either way, it will
> > include the delay in processing the tx queue, which on multi-queue
> > NICs and with TSO may be substantial (even before considering HW
> > rate limiting).
> >
> > > > > #define SKBTX_ANY_TSTAMP (SKBTX_HW_TSTAMP | \
> > > > > SKBTX_HW_TSTAMP_USE_CYCLES | \
> > > > > SKBTX_ANY_SW_TSTAMP)
> > > > > diff --git a/include/uapi/linux/errqueue.h b/include/uapi/linux/errqueue.h
> > > > > index 3c70e8ac14b8..1ea47309d772 100644
> > > > > --- a/include/uapi/linux/errqueue.h
> > > > > +++ b/include/uapi/linux/errqueue.h
> > > > > @@ -73,6 +73,7 @@ enum {
> > > > > SCM_TSTAMP_SND, /* driver passed skb to NIC, or HW */
> > > > > SCM_TSTAMP_SCHED, /* data entered the packet scheduler */
> > > > > SCM_TSTAMP_ACK, /* data acknowledged by peer */
> > > > > + SCM_TSTAMP_COMPLETION, /* packet tx completion */
> > > > > };
> > > > >
> > > > > #endif /* _UAPI_LINUX_ERRQUEUE_H */
> > > > > diff --git a/include/uapi/linux/net_tstamp.h b/include/uapi/linux/net_tstamp.h
> > > > > index 55b0ab51096c..383213de612a 100644
> > > > > --- a/include/uapi/linux/net_tstamp.h
> > > > > +++ b/include/uapi/linux/net_tstamp.h
> > > > > @@ -44,8 +44,9 @@ enum {
> > > > > SOF_TIMESTAMPING_BIND_PHC = (1 << 15),
> > > > > SOF_TIMESTAMPING_OPT_ID_TCP = (1 << 16),
> > > > > SOF_TIMESTAMPING_OPT_RX_FILTER = (1 << 17),
> > > > > + SOF_TIMESTAMPING_TX_COMPLETION = (1 << 18),
> > > > >
> > > > > - SOF_TIMESTAMPING_LAST = SOF_TIMESTAMPING_OPT_RX_FILTER,
> > > > > + SOF_TIMESTAMPING_LAST = SOF_TIMESTAMPING_TX_COMPLETION,
> > > > > SOF_TIMESTAMPING_MASK = (SOF_TIMESTAMPING_LAST - 1) |
> > > > > SOF_TIMESTAMPING_LAST
> > > > > };
> > > > > @@ -58,7 +59,8 @@ enum {
> > > > > #define SOF_TIMESTAMPING_TX_RECORD_MASK (SOF_TIMESTAMPING_TX_HARDWARE | \
> > > > > SOF_TIMESTAMPING_TX_SOFTWARE | \
> > > > > SOF_TIMESTAMPING_TX_SCHED | \
> > > > > - SOF_TIMESTAMPING_TX_ACK)
> > > > > + SOF_TIMESTAMPING_TX_ACK | \
> > > > > + SOF_TIMESTAMPING_TX_COMPLETION)
> > > > >
> > > > > /**
> > > > > * struct so_timestamping - SO_TIMESTAMPING parameter
> > > > > diff --git a/net/ethtool/common.c b/net/ethtool/common.c
> > > > > index 2bd77c94f9f1..75e3b756012e 100644
> > > > > --- a/net/ethtool/common.c
> > > > > +++ b/net/ethtool/common.c
> > > > > @@ -431,6 +431,7 @@ const char sof_timestamping_names[][ETH_GSTRING_LEN] = {
> > > > > [const_ilog2(SOF_TIMESTAMPING_BIND_PHC)] = "bind-phc",
> > > > > [const_ilog2(SOF_TIMESTAMPING_OPT_ID_TCP)] = "option-id-tcp",
> > > > > [const_ilog2(SOF_TIMESTAMPING_OPT_RX_FILTER)] = "option-rx-filter",
> > > > > + [const_ilog2(SOF_TIMESTAMPING_TX_COMPLETION)] = "completion-transmit",
> > > >
> > > > just "tx-completion"?
> > >
> > > Ok.
> > >
> > > --
> > > Pauli Virtanen
> >
> >
>
>
> --
> Luiz Augusto von Dentz
Powered by blists - more mailing lists