lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAL+tcoA7Efzxg9c-CBn3S0JEQZLUHBaCA+dL=mgWbVh26SukgA@mail.gmail.com>
Date: Wed, 5 Feb 2025 02:09:20 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: Martin KaFai Lau <martin.lau@...ux.dev>, davem@...emloft.net, edumazet@...gle.com, 
	kuba@...nel.org, pabeni@...hat.com, dsahern@...nel.org, willemb@...gle.com, 
	ast@...nel.org, daniel@...earbox.net, andrii@...nel.org, eddyz87@...il.com, 
	song@...nel.org, yonghong.song@...ux.dev, john.fastabend@...il.com, 
	kpsingh@...nel.org, sdf@...ichev.me, haoluo@...gle.com, jolsa@...nel.org, 
	horms@...nel.org, bpf@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCH bpf-next v7 11/13] net-timestamp: add a new callback in tcp_tx_timestamp()

On Wed, Feb 5, 2025 at 1:08 AM Willem de Bruijn
<willemdebruijn.kernel@...il.com> wrote:
>
> Jason Xing wrote:
> > On Tue, Feb 4, 2025 at 9:16 AM Martin KaFai Lau <martin.lau@...ux.dev> wrote:
> > >
> > > On 1/28/25 12:46 AM, Jason Xing wrote:
> > > > Introduce the callback to correlate tcp_sendmsg timestamp with other
> > > > points, like SND/SW/ACK. We can let bpf trace the beginning of
> > > > tcp_sendmsg_locked() and fetch the socket addr, so that in
> > >
> > > Instead of "fetch the socket addr...", should be "store the sendmsg timestamp at
> > > the bpf_sk_storage ...".
> >
> > I will revise it. Thanks.
> >
> > >
> > > > tcp_tx_timestamp() we can correlate the tskey with the socket addr.
> > >
> > >
> > > > It is accurate since they are under the protect of socket lock.
> > > > More details can be found in the selftest.
> > >
> > > The selftest uses the bpf_sk_storage to store the sendmsg timestamp at
> > > fentry/tcp_sendmsg_locked and retrieves it back at tcp_tx_timestamp (i.e.
> > > BPF_SOCK_OPS_TS_SND_CB added in this patch).
> > >
> > > >
> > > > Signed-off-by: Jason Xing <kerneljasonxing@...il.com>
> > > > ---
> > > >   include/uapi/linux/bpf.h       | 7 +++++++
> > > >   net/ipv4/tcp.c                 | 1 +
> > > >   tools/include/uapi/linux/bpf.h | 7 +++++++
> > > >   3 files changed, 15 insertions(+)
> > > >
> > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > > index 800122a8abe5..accb3b314fff 100644
> > > > --- a/include/uapi/linux/bpf.h
> > > > +++ b/include/uapi/linux/bpf.h
> > > > @@ -7052,6 +7052,13 @@ enum {
> > > >                                        * when SK_BPF_CB_TX_TIMESTAMPING
> > > >                                        * feature is on.
> > > >                                        */
> > > > +     BPF_SOCK_OPS_TS_SND_CB,         /* Called when every sendmsg syscall
> > > > +                                      * is triggered. For TCP, it stays
> > > > +                                      * in the last send process to
> > > > +                                      * correlate with tcp_sendmsg timestamp
> > > > +                                      * with other timestamping callbacks,
> > > > +                                      * like SND/SW/ACK.
> > >
> > > Do you have a chance to look at how this will work at UDP?
> >
> > Sure, I feel like it could not be useful for UDP. Well, things get
> > strange because I did write a long paragraph about this thing which
> > apparently disappeared...
> >
> > I manage to find what I wrote:
> >     For UDP type, BPF_SOCK_OPS_TS_SND_CB may be not suitable because
> >     there are two sending process, 1) lockless path, 2) lock path, which
> >     should be handled carefully later. For the former, even though it's
> >     unlikely multiple threads access the socket to call sendmsg at the
> >     same time, I think we'd better not correlate it like what we do to the
> >     TCP case because of the lack of sock lock protection. Considering SND_CB is
> >     uapi flag, I think we don't need to forcely add the 'TCP_' prefix in
> >     case we need to use it someday.
> >
> >     And one more thing is I'd like to use the v5[1] method in the next round
> >     to introduce a new tskey_bpf which is good for UDP type. The new field
> >     will not conflict with the tskey in shared info which is generated
> >     by sk->sk_tskey in __ip_append_data(). It hardly works if both features
> >     (so_timestamping and its bpf extension) exists at the same time. Users
> >     could get confused because sometimes they fetch the tskey from skb,
> >     sometimes they don't, especially when we have cmsg feature to turn it on/
> >     off per sendmsg. A standalone tskey for bpf extension will be needed.
> >     With this tskey_bpf, we can easily correlate the timestamp in sendmsg
> >     syscall with other tx points(SND/SW/ACK...).
> >
> >     [1]: https://lore.kernel.org/all/20250112113748.73504-14-kerneljasonxing@gmail.com/
> >
> >     If possible, we can leave this question until the UDP support series
> >     shows up. I will figure out a better solution :)
> >
> > In conclusion, it probably won't be used by the UDP type. It's uAPI
> > flag so I consider the compatibility reason.
>
> I don't think this is acceptable. We should aim for an API that can
> easily be used across protocols, like SO_TIMESTAMPING.

After I revisit the UDP SO_TIMESTAMPING again, my thoughts are
adjusted like below:

It's hard to provide an absolutely uniform interface or usage to users
for TCP and UDP and even more protocols. Cases can be handled one by
one. The main obstacle is how we can correlate the timestamp in
sendmsg syscall with other sending timestamps. It's worth noticing
that for SO_TIMESTAMPING the sendmsg timestamp is collected in the
userspace. For instance, while skb enters the qdisc, we fail to know
which skb belongs to which sendmsg.

An idea coming up is to introduce BPF_SOCK_OPS_TS_SND_CB to correlate
the sendmsg timestamp with tskey (in tcp_tx_timestamp()) under the
protection of socket lock + syscall as the current patch does. But for
UDP, it can be lockless. IIUC, there is a very special case where even
SO_TIMESTAMPING may get lost: if multiple threads accessing the same
socket send UDP packets in parallel, then users could be confused
which tskey matches which sendmsg. IIUC, I will not consider this
unlikely case, then the UDP case is quite similar to the TCP case.

The scenario for the UDP case is:
1) using fentry bpf to hook the udp_sendmsg() to get the timestamp
like TCP does in this series.
2) insert BPF_SOCK_OPS_TS_SND_CB into __ip_append_data() near the
SO_TIMESTAMPING code snippets to let bpf program correlate the tskey
with timestamp.
Note: tskey in UDP will be handled carefully in a different way
because we should support both modes for socket timestamping at the
same time.
It's really similar to TCP regardless of handling tskey.

I feel like I have to change back the name to BPF_SOCK_OPS_TS_TCP_SND_CB?

Thanks,
Jason

> Taking a
> timestamp at sendmsg entry is a useful property for all such
> protocols.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ