[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7ade362f178297751e8a0846e0342d5086623edc.camel@iki.fi>
Date: Tue, 14 May 2024 19:19:45 +0300
From: Pauli Virtanen <pav@....fi>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>, Luiz Augusto von
Dentz <luiz.dentz@...il.com>
Cc: Jakub Kicinski <kuba@...nel.org>, davem@...emloft.net,
linux-bluetooth@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: pull request: bluetooth-next 2024-05-10
Hi all,
Thanks for the comments, I guess the original series should have been
Cc'd more widely to get them earlier.
ma, 2024-05-13 kello 22:09 -0400, Willem de Bruijn kirjoitti:
> Luiz Augusto von Dentz wrote:
> > Hi Willem,
> >
> > On Mon, May 13, 2024 at 9:32 PM Willem de Bruijn
> > <willemdebruijn.kernel@...il.com> wrote:
> > >
> > > Jakub Kicinski wrote:
> > > > On Mon, 13 May 2024 18:09:31 -0400 Luiz Augusto von Dentz wrote:
> > > > > > There is one more warning in the Intel driver:
> > > > > >
> > > > > > drivers/bluetooth/btintel_pcie.c:673:33: warning: symbol 'causes_list'
> > > > > > was not declared. Should it be static?
> > > > >
> > > > > We have a fix for that but I was hoping to have it in before the merge
> > > > > window and then have the fix merged later.
> > > > >
> > > > > > It'd also be great to get an ACK from someone familiar with the socket
> > > > > > time stamping (Willem?) I'm not sure there's sufficient detail in the
> > > > > > commit message to explain the choices to:
> > > > > > - change the definition of SCHED / SEND to mean queued / completed,
> > > > > > while for Ethernet they mean queued to qdisc, queued to HW.
> > > > >
> > > > > hmm I thought this was hardware specific, it obviously won't work
> > > > > exactly as Ethernet since it is a completely different protocol stack,
> > > > > or are you suggesting we need other definitions for things like TX
> > > > > completed?
> > > >
> > > > I don't know anything about queuing in BT, in terms of timestamping
> > > > the SEND - SCHED difference is supposed to indicate the level of
> > > > host delay or host congestion. If the queuing in BT happens mostly in
> > > > the device HW queue then it may make sense to generate SCHED when
> > > > handing over to the driver. OTOH if the devices can coalesce or delay
> > > > completions the completion timeout may be less accurate than stamping
> > > > before submitting to HW... I'm looking for the analysis that the choices
> > > > were well thought thru.
> > >
> > > SCM_TSTAMP_SND is taken before an skb is passed to the device.
> > > This matches request SOF_TIMESTAMPING_TX_SOFTWARE.
> > >
> > > A timestamp returned on transmit completion is requested as
> > > SOF_TIMESTAMPING_TX_HARDWARE. We do not have a type for a software
> > > timestamp taken at tx completion cleaning. If anything, I would think
> > > it would be a passes as a hardware timestamp.
> >
> > In that case I think we probably misinterpret it, at least I though
> > that TX_HARDWARE would really be a hardware generated timestamp using
> > it own clock
>
> It normally is. It is just read from the tx descriptor on completion.
>
> We really don't have a good model for a software timestamp taken at
> completion processing.
>
> It may be worthwhile more broadly, especially for devices that do not
> support true hardware timestamps.
>
> Perhaps we should add an SCM_TSTAMP_TXCOMPLETION for this case. And a
> new SOF_TIMESTAMPING option to go with it. Similar to what we did for
> SCM_STAMP_SCHED.
Ok, I was also under the impression TX_HARDWARE was only for actual HW
timestamps. TSTAMP_ACK appeared to not really match semantics either,
so TSTAMP_SND it then was.
The general timestamping flow here was:
sendmsg() from user generates skbs to net/bluetooth side queue
|
* wait in net/bluetooth side queue until HW has free packet slot
|
* send to driver (-> SCM_TSTAMP_SCHED*)
|
* driver (usu. ASAP) queues to transport e.g. USB
|
* transport tx complete, skb freed
|
* packet waits in hardware-side buffers (usu. the largest delay)
|
* packet completion report from HW (-> SCM_TSTAMP_SND*)
|
* for one packet type, HW timestamp for last tx packet can queried
The packet completion report does not imply the packet was received.
>From the above, I gather SCHED* should be SND, and SND* should be
TXCOMPLETION. Then I'm not sure when we should generate SCHED, if at
all, unless it's done more or less in sendmsg() when it generates the
skbs.
Possibly the SND timestamp could also be generated on driver side if
one wants to have it taken at transport tx completion. I don't
immediately know what is the use case would be though, as the packet
may still have to wait on HW side before it goes over the air.
For the use case here, we want to know the total latency, so the
completion timestamp is the interesting one. In the audio use case, in
normal operation there is a free HW slot and packets do not wait in
net/bluetooth queues but end up in HW buffers ASAP (fast, maybe < 1
ms), and then wait a much longer time (usu. 5-50 ms) in the HW buffers
before it reports completion.
> > if you are saying that TX_HARDWARE is just marking the
> > TX completion of the packet at the host then we can definitely align
> > with the current exception, that said we do have a command to actually
> > read out the actual timestamp from the BT controller, that is usually
> > more precise since some of the connection do require usec precision
> > which is something that can get skew by the processing of HCI events
> > themselves, well I guess we use that if the controller supports it and
> > if it doesn't then we do based on the host timestamp when processing
> > the HCI event indicating the completion of the transmission.
> >
> > > Returning SCHED when queuing to a device and SND later on receiving
> > > completions seems like not following SO_TIMESTAMPING convention to me.
> > > But I don't fully know the HCI model.
> > >
> > > As for the "experimental" BT_POLL_ERRQUEUE. This is an addition to the
> > > ABI, right? So immutable. Is it fair to call that experimental?
> >
> > I guess you are referring to the fact that sockopt ID reserved to
> > BT_POLL_ERRQUEUE cannot be reused anymore even if we drop its usage in
> > the future, yes that is correct, but we can actually return
> > ENOPROTOOPT as it current does:
> >
> > if (!bt_poll_errqueue_enabled())
> > return -ENOPROTOOPT
>
> I see. Once applications rely on a feature, it can be hard to actually
> deprecate. But in this case it may be possible.
>
> > Anyway I would be really happy to drop it so we don't have to worry
> > about it later.
> >
> > > It might be safer to only suppress the sk_error_report in
> > > sock_queue_err_skb. Or at least in bt_sock_poll to check the type of
> > > all outstanding errors and only suppress if all are timestamps.
> >
> > Or perhaps we could actually do that via poll/epoll directly? Not that
> > it would make it much simpler since the library tends to wrap the
> > usage of poll/epoll but POLLERR meaning both errors or errqueue events
> > is sort of the problem we are trying to figure out how to process them
> > separately.
>
> The process would still be awoken, of course. If bluetoothd can just
> be modified to ignore the reports, that would indeed be easiest from
> a kernel PoV.
This can be done on bluetoothd side, the ugly part is just the wakeup
on every TX timestamp, which is every ~10ms in these use cases if every
packet is stamped. EPOLLET probably would indeed avoid busy looping on
the same timestamp though.
In the first round of this patchset, this was handled on bluetoothd
side without kernel additions, with rate limiting the polling. If
POLL_ERRQUEUE sounds like a bad idea, maybe we can go back to that.
--
Pauli Virtanen
Powered by blists - more mailing lists