netdev - Re: [PATCH net-next v2 00/12] net-timestamp: bpf extension to equip applications transparently

Message-ID: <CAL+tcoA-pMZniF2wmYJBR+PKCWThT+i+h5K-cRs3P5yqe3x-PQ@mail.gmail.com>
Date: Tue, 15 Oct 2024 10:52:07 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org, 
	pabeni@...hat.com, dsahern@...nel.org, willemb@...gle.com, ast@...nel.org, 
	daniel@...earbox.net, andrii@...nel.org, martin.lau@...ux.dev, 
	eddyz87@...il.com, song@...nel.org, yonghong.song@...ux.dev, 
	john.fastabend@...il.com, kpsingh@...nel.org, sdf@...ichev.me, 
	haoluo@...gle.com, jolsa@...nel.org, bpf@...r.kernel.org, 
	netdev@...r.kernel.org, Jason Xing <kernelxing@...cent.com>
Subject: Re: [PATCH net-next v2 00/12] net-timestamp: bpf extension to equip
 applications transparently

On Tue, Oct 15, 2024 at 9:28 AM Willem de Bruijn
<willemdebruijn.kernel@...il.com> wrote:
>
> Jason Xing wrote:
> > On Sun, Oct 13, 2024 at 1:48 AM Willem de Bruijn
> > <willemdebruijn.kernel@...il.com> wrote:
> > >
> > > Jason Xing wrote:
> > > > From: Jason Xing <kernelxing@...cent.com>
> > > >
> > > > A few weeks ago, I planned to extend the SO_TIMESTAMPING feature by
> > > > using a tracepoint to print information (say, tstamp) so that we can
> > > > transparently equip applications with this feature and require no
> > > > modification on the user side.
> > > >
> > > > Later, we discussed at netconf and agreed that we can use bpf for better
> > > > extension, which is mainly suggested by John Fastabend and Willem de
> > > > Bruijn. Many thanks here! So I post this series to see if we have a
> > > > better solution to extend. My feeling is BPF is a good place to provide
> > > > a way to add timestamping by administrators, without having to rebuild
> > > > applications.
> > > >
> > > > This approach mostly relies on the existing SO_TIMESTAMPING feature;
> > > > users only need to pass certain flags through bpf_setsockopt() to a
> > > > separate tsflags. For TX timestamps, they will be printed during the
> > > > generation phase. For RX timestamps, we will wait for the moment when
> > > > recvmsg() is called.
> > > >
> > > > After this series, we could step by step implement more advanced
> > > > functions/flags already in SO_TIMESTAMPING feature for bpf extension.
> > > >
> > > > In this series, I only support the TCP protocol, which is widely used
> > > > with the SO_TIMESTAMPING feature.
> > > >
> > > > ---
> > > > V2
> > > > Link: https://lore.kernel.org/all/20241008095109.99918-1-kerneljasonxing@gmail.com/
> > > > 1. Introduce tsflag requestors so that we are able to extend more in the
> > > > future. Besides, it enables TX flags for bpf extension feature separately
> > > > without breaking users. It is suggested by Vadim Fedorenko.
> > > > 2. introduce a static key to control the whole feature. (Willem)
> > > > 3. Open the gate of bpf_setsockopt for the SO_TIMESTAMPING feature in
> > > > some TX/RX cases, not all the cases.
> > > >
> > > > Note:
> > > > The main concern we discussed in the V1 thread is how to deal with
> > > > applications that already use the SO_TIMESTAMPING feature. In this
> > > > series, I allow both cases to happen at the same time, which means
> > > > that even an application setting SO_TIMESTAMPING can still be traced
> > > > through a BPF program. Please see patch [04/12].
> > >
> > > This revision does not address the main concern.
> > >
> > > An administrator installed BPF program can affect results of a process
> > > using SO_TIMESTAMPING in ways that break it.
> >
> > Sorry, I didn't get it. How would the following code snippet break users?
>
> The state between user and bpf timestamping needs to be separate to
> avoid interference.

Would you agree to the following method, which allows only one of the
two (user or bpf timestamping) to work at a time?

void __skb_tstamp_tx(struct sk_buff *orig_skb,
                     const struct sk_buff *ack_skb,
                     struct skb_shared_hwtstamps *hwtstamps,
                     struct sock *sk, int tstype)
{
        if (!sk)
                return;

        /* The app has set SO_TIMESTAMPING: report to it and skip bpf */
        if (skb_tstamp_tx_output(orig_skb, ack_skb, hwtstamps, sk, tstype))
                return;

        if (static_branch_unlikely(&bpf_tstamp_control))
                bpf_skb_tstamp_tx_output(sk, orig_skb, tstype, hwtstamps);
}

This means that if the app uses the non-bpf method, we will not see the
bpf output even if a bpf program is loaded.
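
To make the either/or behaviour concrete, here is a minimal user-space
model of that dispatch. It is only a sketch: struct model_sock, the
model_* functions and bpf_tstamp_enabled are hypothetical stand-ins made
up for illustration, not kernel API, and the plain bool merely imitates
the static key.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-socket state; not a kernel struct. */
struct model_sock {
	uint32_t tsflags_user;  /* flags requested by the application */
	uint32_t tsflags_bpf;   /* flags requested by a bpf program */
	int user_reports;       /* timestamps delivered to the application */
	int bpf_reports;        /* timestamps delivered to the bpf program */
};

/* Imitates the static key that gates the whole bpf feature. */
static bool bpf_tstamp_enabled;

/* Returns true when the application asked for timestamps and was served. */
static bool model_user_output(struct model_sock *sk)
{
	if (!sk->tsflags_user)
		return false;
	sk->user_reports++;
	return true;
}

static void model_bpf_output(struct model_sock *sk)
{
	if (sk->tsflags_bpf)
		sk->bpf_reports++;
}

/* Either/or dispatch: the bpf path runs only if the user path did not. */
static void model_tstamp_tx(struct model_sock *sk)
{
	if (!sk)
		return;

	if (model_user_output(sk))
		return;

	if (bpf_tstamp_enabled)
		model_bpf_output(sk);
}
```

With both requesters active on one socket, only user_reports increases
in this model; bpf_reports increases only for sockets where the
application never asked for timestamps.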

>
> Introducing a new sk_tsflags for bpf goes a long way. Though I prefer
> a separate sk_tsflags_bpf and not touching existing sk_tsflags over
> the array approach of patch 1. Also need to check pahole and maybe
> move sk_tsflags_bpf elsewhere in the struct.

Yes, I will use this instead.

>
> Other state is sk_tskey. The current approach can initialize the key
> in bpf before the user attempts it for the same socket. Admittedly
> unlikely. But hard-to-reach states create hard-to-debug issues.
>
> This field cannot easily be duplicated, because the key is tracked
> in skb_shinfo, where there is not sufficient room for two keys.
>
> The same goes for txflags.

They are not that easy to handle properly. That's the reason why I
chose to reuse the same logic, so that there is no side effect.

If we are expected to separate them as well, it seems a little bit
weird to introduce another set of similar flags in struct sk_buff.
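
The sk_tskey concern can be illustrated with a simplified user-space
model. It assumes that the key counter is reset only when the
OPT_ID-style flag is first turned on, and that each send consumes one
key (for TCP the real key is byte based, which this model ignores).
MODEL_OPT_ID, struct model_sock and the model_* helpers are
hypothetical names used only for this sketch.

```c
#include <stdint.h>

#define MODEL_OPT_ID 0x1u  /* stands in for SOF_TIMESTAMPING_OPT_ID */

/* Hypothetical socket model with state shared by user and bpf. */
struct model_sock {
	uint32_t tsflags;  /* shared flags word */
	uint32_t tskey;    /* shared per-socket key counter */
};

/* Assumption: the counter is reset only on the off -> on transition. */
static void model_enable_opt_id(struct model_sock *sk)
{
	if (!(sk->tsflags & MODEL_OPT_ID)) {
		sk->tskey = 0;
		sk->tsflags |= MODEL_OPT_ID;
	}
}

/* Each send consumes one key and returns the key it was assigned. */
static uint32_t model_send(struct model_sock *sk)
{
	return sk->tskey++;
}
```

In this model, if bpf enables the flag and three sends happen before
the application enables it, the application's first key is 3 rather
than the 0 it would see on a socket it initialized itself.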

>
> The current approach is to set those flags if either user or bpf
> requests them, then in __skb_tstamp_tx detect if the user did not set
> them, and if so skip output to the user. Need to take a closer look,
> but seems to work.

Let me keep the current approach; the two will not affect each other.
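
A rough user-space sketch of that scheme: the per-packet flags are the
union of what the user and bpf requested, and at completion time each
side is served only if it asked for that timestamp type itself. The
MODEL_* constants, the two structs and the model_* helpers are
hypothetical names for illustration; tsflags_bpf plays the role of the
separate sk_tsflags_bpf discussed above.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_TX_SOFTWARE 0x1u  /* illustrative timestamp types */
#define MODEL_TX_ACK      0x2u

struct model_sock {
	uint32_t tsflags;      /* requested by the application */
	uint32_t tsflags_bpf;  /* requested by a bpf program */
};

struct model_skb {
	uint32_t tx_flags;     /* single per-packet field, shared by both */
};

/* Send time: mark the packet if either requester wants a timestamp. */
static void model_tx_mark(const struct model_sock *sk, struct model_skb *skb)
{
	skb->tx_flags = sk->tsflags | sk->tsflags_bpf;
}

/* Completion time: report to the user only if the user asked for it. */
static bool model_report_to_user(const struct model_sock *sk,
				 const struct model_skb *skb, uint32_t type)
{
	return (skb->tx_flags & type) && (sk->tsflags & type);
}

/* Completion time: report to bpf only if bpf asked for it. */
static bool model_report_to_bpf(const struct model_sock *sk,
				const struct model_skb *skb, uint32_t type)
{
	return (skb->tx_flags & type) && (sk->tsflags_bpf & type);
}
```

Because the packet carries only the union, the per-socket fields are
what keep a bpf-only request from producing output the application
never asked for.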

>
> So getting closer.

Thanks for the careful review.

Thanks,
Jason