Message-ID: <CAL+tcoD8OF0LCSFVEN-oEQas1JGfR+HF7Zt+2fqMH5_4eK9X4g@mail.gmail.com>
Date: Wed, 16 Oct 2024 09:24:05 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: Martin KaFai Lau <martin.lau@...ux.dev>
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, dsahern@...nel.org, willemdebruijn.kernel@...il.com,
willemb@...gle.com, ast@...nel.org, daniel@...earbox.net, andrii@...nel.org,
eddyz87@...il.com, song@...nel.org, yonghong.song@...ux.dev,
john.fastabend@...il.com, kpsingh@...nel.org, sdf@...ichev.me,
haoluo@...gle.com, jolsa@...nel.org, bpf@...r.kernel.org,
netdev@...r.kernel.org, Jason Xing <kernelxing@...cent.com>
Subject: Re: [PATCH net-next v2 06/12] net-timestamp: introduce
TS_SCHED_OPT_CB to generate dev xmit timestamp
On Wed, Oct 16, 2024 at 9:01 AM Martin KaFai Lau <martin.lau@...ux.dev> wrote:
>
> On 10/11/24 9:06 PM, Jason Xing wrote:
> > From: Jason Xing <kernelxing@...cent.com>
> >
> > Introduce BPF_SOCK_OPS_TS_SCHED_OPT_CB flag so that we can decide to
> > print timestamps when the skb just passes the dev layer.
> >
> > Signed-off-by: Jason Xing <kernelxing@...cent.com>
> > ---
> > include/uapi/linux/bpf.h | 5 +++++
> > net/core/skbuff.c | 17 +++++++++++++++--
> > tools/include/uapi/linux/bpf.h | 5 +++++
> > 3 files changed, 25 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 157e139ed6fc..3cf3c9c896c7 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -7019,6 +7019,11 @@ enum {
> > * by the kernel or the
> > * earlier bpf-progs.
> > */
> > + BPF_SOCK_OPS_TS_SCHED_OPT_CB, /* Called when skb is passing through
> > + * dev layer when SO_TIMESTAMPING
> > + * feature is on. It indicates the
> > + * recorded timestamp.
> > + */
> > };
> >
> > /* List of TCP states. There is a build check in net/ipv4/tcp.c to detect
> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > index 3a4110d0f983..16e7bdc1eacb 100644
> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -5632,8 +5632,21 @@ static void bpf_skb_tstamp_tx_output(struct sock *sk, int tstype)
> > return;
> >
> > tp = tcp_sk(sk);
> > - if (BPF_SOCK_OPS_TEST_FLAG(tp, BPF_SOCK_OPS_TX_TIMESTAMPING_OPT_CB_FLAG))
> > - return;
> > + if (BPF_SOCK_OPS_TEST_FLAG(tp, BPF_SOCK_OPS_TX_TIMESTAMPING_OPT_CB_FLAG)) {
> > + struct timespec64 tstamp;
> > + u32 cb_flag;
> > +
> > + switch (tstype) {
> > + case SCM_TSTAMP_SCHED:
> > + cb_flag = BPF_SOCK_OPS_TS_SCHED_OPT_CB;
> > + break;
> > + default:
> > + return;
> > + }
> > +
> > + tstamp = ktime_to_timespec64(ktime_get_real());
> > + tcp_call_bpf_2arg(sk, cb_flag, tstamp.tv_sec, tstamp.tv_nsec);
>
> There is bpf_ktime_get_*() helper. The bpf prog can directly call the
> bpf_ktime_get_* helper and use whatever clock it sees fit instead of enforcing
> real clock here and doing an extra ktime_to_timespec64. Right now the
> bpf_ktime_get_*() does not have real clock which I think it can be added.
In that case there is no need to add tcp_call_bpf_*arg() to pass the
timestamp to userspace, right? The bpf program can fetch it by itself.
Then I wonder what information I should pass instead. Sorry for my lack
of BPF-related knowledge :(
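To check my understanding of the suggestion, here is a rough userspace
analogy (plain C, not BPF or kernel code; ts_event, read_clock_ns() and
on_ts_event() are made-up names for illustration). The hook only reports
which event fired, and the callback reads whichever clock it prefers. A
real bpf prog would call a bpf_ktime_get_*() helper where this sketch
calls clock_gettime().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical event id, standing in for a sock_ops callback type. */
enum ts_event { TS_EVENT_SCHED = 1 };

/* Read the given POSIX clock and return nanoseconds, or -1 on failure. */
static int64_t read_clock_ns(clockid_t clk)
{
	struct timespec ts;

	if (clock_gettime(clk, &ts) != 0)
		return -1;
	/* Sanity check: a valid timespec keeps tv_nsec below one second. */
	assert(ts.tv_nsec >= 0 && ts.tv_nsec < 1000000000L);
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* The callback receives only the event and chooses the clock itself. */
static int64_t on_ts_event(enum ts_event ev, clockid_t preferred)
{
	if (ev != TS_EVENT_SCHED)
		return -1;	/* events this sketch does not handle */
	return read_clock_ns(preferred);
}
```

With this shape the caller never computes a timestamp: one consumer can
ask for CLOCK_MONOTONIC and another for CLOCK_REALTIME without any
change on the reporting side.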
>
> I think overall the tstamp reporting interface does not necessarily have to
> follow the socket API. The bpf prog is running in the kernel. It could pass
> other information to the bpf prog if it sees fit. e.g. the bpf prog could also
> get the original transmitted tcp skb if it is useful.
Good to know! But how would the BPF program parse the skb if it is
invoked through tcp_call_bpf_2arg(), which only passes u32 parameters?
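To make the limitation concrete, a small userspace sketch (plain C with
made-up names such as two_u32_args and split_u64(); it only mirrors the
"two u32 arguments" shape and is not the kernel helper). Two u32 slots
hold at most 64 bits of plain data: either one 64-bit value split in
halves, or tv_sec and tv_nsec with tv_sec truncated to 32 bits. Anything
richer, such as an skb, would have to reach the program some other way.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of a callback interface with two u32 arguments. */
struct two_u32_args {
	uint32_t arg1;
	uint32_t arg2;
};

/* Split one 64-bit value (e.g. a nanosecond count) into two u32 halves. */
static struct two_u32_args split_u64(uint64_t v)
{
	struct two_u32_args a = { (uint32_t)(v >> 32), (uint32_t)v };
	return a;
}

/* Reassemble the 64-bit value on the receiving side. */
static uint64_t join_u64(struct two_u32_args a)
{
	return ((uint64_t)a.arg1 << 32) | a.arg2;
}

/* A 64-bit seconds value survives a u32 argument only inside this range. */
static int sec_fits_u32(int64_t tv_sec)
{
	return tv_sec >= 0 && tv_sec <= (int64_t)UINT32_MAX;
}

/* Usage example: round-trip a value and probe the seconds limit. */
static void demo(void)
{
	uint64_t ns = 0x0123456789abcdefULL;

	assert(join_u64(split_u64(ns)) == ns);
	assert(sec_fits_u32(1729041845));	/* a 2024-era tv_sec */
	assert(!sec_fits_u32((int64_t)UINT32_MAX + 1));
}
```

Splitting works for a single scalar, but it uses up both arguments, so
nothing is left to describe the packet itself.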
Thanks,
Jason