Message-ID: <b332beac0f22636e7877c681b3adb9d6ff70cde3.camel@alumni.tu-berlin.de>
Date: Thu, 18 Jan 2024 15:53:17 +0100
From: Jörn-Thorben Hinz <j-t.hinz@...mni.tu-berlin.de>
To: Martin KaFai Lau <martin.lau@...ux.dev>, Willem de Bruijn
<willemdebruijn.kernel@...il.com>
Cc: Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann
<daniel@...earbox.net>, Andrii Nakryiko <andrii@...nel.org>, "David S.
Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, Jakub
Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, Shuah Khan
<shuah@...nel.org>, Arnd Bergmann <arnd@...db.de>, Deepa Dinamani
<deepa.kernel@...il.com>, bpf@...r.kernel.org,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
linux-kselftest@...r.kernel.org
Subject: Re: [PATCH bpf-next] bpf: Allow setting SO_TIMESTAMPING* with
bpf_setsockopt()

Hmm, after taking a new look at it today, I think my patch can be
disregarded, at least for having a BPF program access *RX* *hardware*
timestamps. (Sorry about the noise then.)

When I looked into this a few months ago, I half-blindly followed
Documentation/networking/timestamping.rst and assumed afterwards that
bpf_setsockopt(SO_TIMESTAMPING*) would be necessary for my use case
(described at the end).

Looking at it again today, it seems the ioctl(SIOCSHWTSTAMP) is
sufficient here: it enables hardware timestamping on the device, and
the resulting timestamps are placed in the hwtstamps field of the
skb's skb_shared_info. This hwtstamps field is where the values of
__sk_buff.hwtstamp and bpf_sock_ops.skb_hwtstamp come from. No further
timestamp processing is involved when a BPF program reads these two
fields, meaning bpf_setsockopt(SOF_TIMESTAMPING_RX_HARDWARE) would be a
no-op from the view of a BPF program.

I started this message before coming to the above understanding but
I've left my replies in below.

With bpf_setsockopt(SOF_TIMESTAMPING_RX_HARDWARE) being unnecessary,
and bpf_setsockopt(SOF_TIMESTAMPING_RX_SOFTWARE), as I understand it,
having a number of possibly unwanted implications, should we leave it
at that here?

On Wed, 2024-01-17 at 13:23 -0800, Martin KaFai Lau wrote:
> > On 1/17/24 7:55 AM, Willem de Bruijn wrote:
> > > > Martin KaFai Lau wrote:
> > > > > > On 1/16/24 7:17 AM, Willem de Bruijn wrote:
> > > > > > > > Jörn-Thorben Hinz wrote:
> > > > > > > > > > A BPF application, e.g., a TCP congestion control, might
> > > > > > > > > > benefit from or even require precise (=hardware) packet
> > > > > > > > > > timestamps. These timestamps are already available
> > > > > > > > > > through __sk_buff.hwtstamp and bpf_sock_ops.skb_hwtstamp,
> > > > > > > > > > but could not be requested: BPF programs were not allowed
> > > > > > > > > > to set SO_TIMESTAMPING* on sockets.
> > > > > >
> > > > > > This patch only uses the SOF_TIMESTAMPING_RX_HARDWARE in the
> > > > > > selftest. How about others? e.g. the SOF_TIMESTAMPING_TX_* that
> > > > > > will affect the sk->sk_error_queue which seems not good. If rx
> > > > > > tstamp is useful, tx tstamp should be useful also?

I admit I only ever looked at enabling and using
SOF_TIMESTAMPING_RX_HARDWARE for my/our use case. Because of that, I was
not aware that _SOFTWARE has further, possibly complicating implications.

> > > >
> > > > Good point. Or should not be allowed to be set from BPF.
> > > >
> > > > That significantly changes process behavior, e.g., by returning
> > > > POLLERR.
> > > >
> > > > > > > > > >
> > > > > > > > > > Enable BPF programs to actively request the generation of
> > > > > > > > > > timestamps from a stream socket. The also required
> > > > > > > > > > ioctl(SIOCSHWTSTAMP) on the network device must still be
> > > > > > > > > > done separately, in user space.
> > > > > >
> > > > > > hmm... so both ioctl(SIOCSHWTSTAMP) of the netdevice and the
> > > > > > SOF_TIMESTAMPING_RX_HARDWARE of the sk must be done?
> > > > > >
> > > > > > I likely miss something. When skb is created in the driver rx
> > > > > > path, the sk is not known yet though. How the
> > > > > > SOF_TIMESTAMPING_RX_HARDWARE of the sk affects the
> > > > > > skb_shinfo(skb)->hwtstamps?

I mostly followed Documentation/networking/timestamping.rst (section 3)
to understand how the hardware timestamps are to be set up and used.

From my understanding, the ioctl(SIOCSHWTSTAMP) makes a persistent
setting for the device/driver, independent of the lifetime of any
socket or skb.

I used a simplified program[1] when trying out this patch a few months
ago.

> > > >
> > > > Indeed it does not seem to do anything in the datapath.
> > > >
> > > > Requesting SOF_TIMESTAMPING_RX_SOFTWARE will call
> > > > net_enable_timestamp to start timestamping packets.
> > > >
> > > > But SOF_TIMESTAMPING_RX_HARDWARE does no such thing.
> > > >
> > > > Drivers do use it in ethtool get_ts_info to signal hardware
> > > > capabilities. But those must be configured using the ioctl.
> > > >
> > > > It is there more for consistency with the other timestamp
> > > > recording options, I suppose.
> > > >
> >
> > Thanks for the explanation on the
> > SOF_TIMESTAMPING_RX_{HARDWARE,SOFTWARE}.
> >
> > __sk_buff.hwtstamp should have the NIC rx timestamp then as long as
> > the NIC is ioctl configured.
> >
> > Jorn, do you need RX_SOFTWARE? From looking at net_timestamp_set(),
> > any socket requested RX_SOFTWARE should be enough to get a
> > skb->tstamp for all skbs. A workaround is to manually create a
> > socket and turn on RX_SOFTWARE.

No, my use case was only for the RX hardware timestamps, taken as close
to the time of packet reception as possible.
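
For anyone who does want the RX_SOFTWARE workaround mentioned above, a
rough user-space sketch might look like this. The function name
open_rx_software_tstamp_socket() is hypothetical, and the choice of a UDP
socket is an assumption on my side; as far as I understand, any socket
requesting SOF_TIMESTAMPING_RX_SOFTWARE would do.

```c
#include <assert.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/net_tstamp.h>

/* Hypothetical helper: open a socket whose only purpose is to request
 * RX software timestamps. Returns the fd, or -1 on failure. The caller
 * must keep the fd open for as long as the timestamps are wanted;
 * closing it drops the request again. */
static int open_rx_software_tstamp_socket(void)
{
    int flags = SOF_TIMESTAMPING_RX_SOFTWARE;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;

    if (setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING,
                   &flags, sizeof(flags)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The socket never has to send or receive anything; it just needs to
exist with the option set.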
> >
> > It will still be nice to get proper bpf_setsockopt() support for
> > RX_SOFTWARE but it should be considered together with how
> > SO_TIMESTAMPING_TX_* should work in bpf prog considering the TX
> > tstamping does not have a workaround solution like RX_SOFTWARE.
> >
> > It is probably cleaner to have a separate bit in sk->sk_tsflags for
> > bpf such that the bpf prog won't be affected by the userspace
> > turning it on/off and it won't change the userspace's expectation
> > also (e.g. sk_error_queue and POLLERR).
> >
> > The part that needs more thoughts in the tx tstamp is how to notify
> > the bpf prog to consume it. Potentially the kernel can involve a bpf
> > prog to collect the tx timestamp when the bpf bit in sk->sk_tsflags
> > is set. An example on how TCP-CC is using it will help to think of
> > the approach here.

My (academic) application was an implementation[2,3] of PowerTCP[4], a
CC that (in its simplified variant) benefits from precise timestamping.
Only the RX timestamps would be of use there.

As mentioned above, I used [1] a while ago when I looked into timestamp
usage. It shows how I imagine the timestamps could be accessed and used
(similarly implemented in [2]).

[1] https://github.com/jtdor/bpf_hwtstamps
[2] https://github.com/inet-tub/powertcp-linux
[3] https://schmiste.github.io/ebpf23.pdf
[4] https://schmiste.github.io/nsdi22powertcp.pdf
> >
> >