netdev - Re: [PATCH net-next v2 06/12] net-timestamp: introduce TS_SCHED_OPT

Message-ID: <CAL+tcoD8OF0LCSFVEN-oEQas1JGfR+HF7Zt+2fqMH5_4eK9X4g@mail.gmail.com>
Date: Wed, 16 Oct 2024 09:24:05 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: Martin KaFai Lau <martin.lau@...ux.dev>
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org, 
	pabeni@...hat.com, dsahern@...nel.org, willemdebruijn.kernel@...il.com, 
	willemb@...gle.com, ast@...nel.org, daniel@...earbox.net, andrii@...nel.org, 
	eddyz87@...il.com, song@...nel.org, yonghong.song@...ux.dev, 
	john.fastabend@...il.com, kpsingh@...nel.org, sdf@...ichev.me, 
	haoluo@...gle.com, jolsa@...nel.org, bpf@...r.kernel.org, 
	netdev@...r.kernel.org, Jason Xing <kernelxing@...cent.com>
Subject: Re: [PATCH net-next v2 06/12] net-timestamp: introduce
 TS_SCHED_OPT_CB to generate dev xmit timestamp

On Wed, Oct 16, 2024 at 9:01 AM Martin KaFai Lau <martin.lau@...ux.dev> wrote:
>
> On 10/11/24 9:06 PM, Jason Xing wrote:
> > From: Jason Xing <kernelxing@...cent.com>
> >
> > Introduce BPF_SOCK_OPS_TS_SCHED_OPT_CB flag so that we can decide to
> > print timestamps when the skb just passes the dev layer.
> >
> > Signed-off-by: Jason Xing <kernelxing@...cent.com>
> > ---
> >   include/uapi/linux/bpf.h       |  5 +++++
> >   net/core/skbuff.c              | 17 +++++++++++++++--
> >   tools/include/uapi/linux/bpf.h |  5 +++++
> >   3 files changed, 25 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 157e139ed6fc..3cf3c9c896c7 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -7019,6 +7019,11 @@ enum {
> >                                        * by the kernel or the
> >                                        * earlier bpf-progs.
> >                                        */
> > +     BPF_SOCK_OPS_TS_SCHED_OPT_CB,   /* Called when skb is passing through
> > +                                      * dev layer when SO_TIMESTAMPING
> > +                                      * feature is on. It indicates the
> > +                                      * recorded timestamp.
> > +                                      */
> >   };
> >
> >   /* List of TCP states. There is a build check in net/ipv4/tcp.c to detect
> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > index 3a4110d0f983..16e7bdc1eacb 100644
> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -5632,8 +5632,21 @@ static void bpf_skb_tstamp_tx_output(struct sock *sk, int tstype)
> >               return;
> >
> >       tp = tcp_sk(sk);
> > -     if (BPF_SOCK_OPS_TEST_FLAG(tp, BPF_SOCK_OPS_TX_TIMESTAMPING_OPT_CB_FLAG))
> > -             return;
> > +     if (BPF_SOCK_OPS_TEST_FLAG(tp, BPF_SOCK_OPS_TX_TIMESTAMPING_OPT_CB_FLAG)) {
> > +             struct timespec64 tstamp;
> > +             u32 cb_flag;
> > +
> > +             switch (tstype) {
> > +             case SCM_TSTAMP_SCHED:
> > +                     cb_flag = BPF_SOCK_OPS_TS_SCHED_OPT_CB;
> > +                     break;
> > +             default:
> > +                     return;
> > +             }
> > +
> > +             tstamp = ktime_to_timespec64(ktime_get_real());
> > +             tcp_call_bpf_2arg(sk, cb_flag, tstamp.tv_sec, tstamp.tv_nsec);
>
> There is bpf_ktime_get_*() helper. The bpf prog can directly call the
> bpf_ktime_get_* helper and use whatever clock it sees fit instead of enforcing
> real clock here and doing an extra ktime_to_timespec64. Right now the
> bpf_ktime_get_*() does not have real clock which I think it can be added.

With that approach, there is no need to add tcp_call_bpf_*arg() to pass
the timestamp, right? The bpf program can read the clock itself.

Then I wonder what information I should pass instead. Sorry for my
lack of BPF-related knowledge :(
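
For reference, here is a minimal user-space sketch (not kernel code) of
what the quoted hunk does with the timestamp: it splits a 64-bit
nanosecond value into seconds and nanoseconds and narrows both to u32
so they fit the two arguments of tcp_call_bpf_2arg(). The names
ts_args, split_ns() and join_ns() are made up for illustration, and
the sketch assumes a non-negative timestamp.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical stand-in for the two u32 arguments that
 * tcp_call_bpf_2arg() carries in the quoted patch.
 */
struct ts_args {
	uint32_t sec;   /* tv_sec narrowed to 32 bits */
	uint32_t nsec;  /* tv_nsec, always below NSEC_PER_SEC */
};

/* Split a non-negative nanosecond timestamp into seconds and
 * nanoseconds, then narrow both halves to u32.
 */
static struct ts_args split_ns(uint64_t ns)
{
	struct ts_args a;

	a.sec = (uint32_t)(ns / NSEC_PER_SEC);
	a.nsec = (uint32_t)(ns % NSEC_PER_SEC);
	return a;
}

/* What the receiving side has to do to get a single value back. */
static uint64_t join_ns(struct ts_args a)
{
	assert(a.nsec < NSEC_PER_SEC);
	return (uint64_t)a.sec * NSEC_PER_SEC + a.nsec;
}
```

The round trip is lossless only while the seconds value fits in 32
bits, which for the real clock holds until the year 2106. If the bpf
program reads the clock through a helper instead, neither the split
nor the narrowing is needed.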

>
> I think overall the tstamp reporting interface does not necessarily have to
> follow the socket API. The bpf prog is running in the kernel. It could pass
> other information to the bpf prog if it sees fit. e.g. the bpf prog could also
> get the original transmitted tcp skb if it is useful.

Good to know! But how can the BPF program parse the skb if it is invoked
through tcp_call_bpf_2arg(), which only passes u32 parameters?
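
One possible direction, if I read the code right: the kernel-side
sock_ops context (struct bpf_sock_ops_kern) has an skb pointer field
next to its u32 args, so the skb would not have to go through the
arguments at all. Below is a small user-space mock of that layout, just
to show the idea. Every name in it (mock_skb, mock_sock_ops_ctx,
mock_prog_skb_len) is hypothetical and none of it is the real kernel
or BPF API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mock of a packet buffer. */
struct mock_skb {
	const uint8_t *data;
	size_t len;
};

/* Hypothetical mock of a callback context: small scalars go in
 * args[], while the packet is reachable through a full-width pointer.
 */
struct mock_sock_ops_ctx {
	uint32_t op;                 /* which callback is being invoked */
	uint32_t args[4];            /* u32 scalar arguments only */
	const struct mock_skb *skb;  /* set by the caller, may be NULL */
};

/* Mock callback: reads the packet length through the context pointer
 * rather than through the u32 args. Returns 0 when no skb is attached.
 */
static size_t mock_prog_skb_len(const struct mock_sock_ops_ctx *ctx)
{
	assert(ctx != NULL);
	return ctx->skb ? ctx->skb->len : 0;
}
```

Whether the timestamping callbacks should set such a pointer, and how
the bpf program would then be allowed to read it, is exactly what I
need advice on.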

Thanks,
Jason