lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAL+tcoDb84bgUUpK9PjijWDt+xw=u2nKkoWf1Gjvkjf--XJ6VA@mail.gmail.com>
Date: Wed, 16 Oct 2024 15:54:29 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: Martin KaFai Lau <martin.lau@...ux.dev>
Cc: Jakub Sitnicki <jakub@...udflare.com>, davem@...emloft.net, edumazet@...gle.com, 
	kuba@...nel.org, pabeni@...hat.com, dsahern@...nel.org, 
	willemdebruijn.kernel@...il.com, willemb@...gle.com, ast@...nel.org, 
	daniel@...earbox.net, andrii@...nel.org, eddyz87@...il.com, song@...nel.org, 
	yonghong.song@...ux.dev, john.fastabend@...il.com, kpsingh@...nel.org, 
	sdf@...ichev.me, haoluo@...gle.com, jolsa@...nel.org, bpf@...r.kernel.org, 
	netdev@...r.kernel.org, Jason Xing <kernelxing@...cent.com>
Subject: Re: [PATCH net-next v2 04/12] net-timestamp: add static key to
 control the whole bpf extension

On Wed, Oct 16, 2024 at 3:01 PM Martin KaFai Lau <martin.lau@...ux.dev> wrote:
>
> On 10/15/24 11:30 PM, Jason Xing wrote:
> > On Wed, Oct 16, 2024 at 2:13 PM Martin KaFai Lau <martin.lau@...ux.dev> wrote:
> >>
> >> On 10/15/24 6:32 PM, Jason Xing wrote:
> >>> On Wed, Oct 16, 2024 at 9:04 AM Jason Xing <kerneljasonxing@...il.com> wrote:
> >>>>
> >>>> On Wed, Oct 16, 2024 at 8:10 AM Martin KaFai Lau <martin.lau@...ux.dev> wrote:
> >>>>>
> >>>>> On 10/11/24 9:06 PM, Jason Xing wrote:
> >>>>>> From: Jason Xing <kernelxing@...cent.com>
> >>>>>>
> >>>>>> Willem suggested that we use a static key to control. The advantage
> >>>>>> is that we will not affect the existing applications at all if we
> >>>>>> don't load BPF program.
> >>>>>>
> >>>>>> In this patch, except the static key, I also add one logic that is
> >>>>>> used to test if the socket has enabled its tsflags in order to
> >>>>>> support bpf logic to allow both cases to happen at the same time.
> >>>>>> Or else, the skb carring related timestamp flag doesn't know which
> >>>>>> way of printing is desirable.
> >>>>>>
> >>>>>> One thing important is this patch allows print from both applications
> >>>>>> and bpf program at the same time. Now we have three kinds of print:
> >>>>>> 1) only BPF program prints
> >>>>>> 2) only application program prints
> >>>>>> 3) both can print without side effect
> >>>>>>
> >>>>>> Signed-off-by: Jason Xing <kernelxing@...cent.com>
> >>>>>> ---
> >>>>>>     include/net/sock.h |  1 +
> >>>>>>     net/core/filter.c  |  3 +++
> >>>>>>     net/core/skbuff.c  | 38 ++++++++++++++++++++++++++++++++++++++
> >>>>>>     3 files changed, 42 insertions(+)
> >>>>>>
> >>>>>> diff --git a/include/net/sock.h b/include/net/sock.h
> >>>>>> index 66ecd78f1dfe..b7c51b95c92d 100644
> >>>>>> --- a/include/net/sock.h
> >>>>>> +++ b/include/net/sock.h
> >>>>>> @@ -2889,6 +2889,7 @@ static inline bool sk_dev_equal_l3scope(struct sock *sk, int dif)
> >>>>>>     void sock_def_readable(struct sock *sk);
> >>>>>>
> >>>>>>     int sock_bindtoindex(struct sock *sk, int ifindex, bool lock_sk);
> >>>>>> +DECLARE_STATIC_KEY_FALSE(bpf_tstamp_control);
> >>>>>>     void sock_set_timestamp(struct sock *sk, int optname, bool valbool);
> >>>>>>     int sock_get_timestamping(struct so_timestamping *timestamping,
> >>>>>>                           sockptr_t optval, unsigned int optlen);
> >>>>>> diff --git a/net/core/filter.c b/net/core/filter.c
> >>>>>> index 996426095bd9..08135f538c99 100644
> >>>>>> --- a/net/core/filter.c
> >>>>>> +++ b/net/core/filter.c
> >>>>>> @@ -5204,6 +5204,8 @@ static const struct bpf_func_proto bpf_get_socket_uid_proto = {
> >>>>>>         .arg1_type      = ARG_PTR_TO_CTX,
> >>>>>>     };
> >>>>>>
> >>>>>> +DEFINE_STATIC_KEY_FALSE(bpf_tstamp_control);
> >>>>>> +
> >>>>>>     static int bpf_sock_set_timestamping(struct sock *sk,
> >>>>>>                                      struct so_timestamping *timestamping)
> >>>>>>     {
> >>>>>> @@ -5217,6 +5219,7 @@ static int bpf_sock_set_timestamping(struct sock *sk,
> >>>>>>                 return -EINVAL;
> >>>>>>
> >>>>>>         WRITE_ONCE(sk->sk_tsflags[BPFPROG_TS_REQUESTOR], flags);
> >>>>>> +     static_branch_enable(&bpf_tstamp_control);
> >>>>>
> >>>>> Not sure when is a good time to do static_branch_disable().
> >>>>
> >>>> Thanks for the review.
> >>>>
> >>>> To be honest, I considered how to disable the static key. Like you
> >>>> said, I failed to find a good chance that I can accurately disable it.
> >>>>
> >>>>>
> >>>>> The bpf prog may be detached also. (IF) it ends up staying with the
> >>>>> cgroup/sockops interface, it should depend on the existing static key in
> >>>>> cgroup_bpf_enabled(CGROUP_SOCK_OPS) instead of adding another one.
> >>>>
> >>>> Are you suggesting that we need to remove the current static key? In
> >>>> the previous thread, the reason why Willem came up with this idea is,
> >>>> I think, to avoid affect the non-bpf timestamping feature.
> >>>>
> >>>>>
> >>>>>>
> >>>>>>         return 0;
> >>>>>>     }
> >>>>>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> >>>>>> index f36eb9daa31a..d0f912f1ff7b 100644
> >>>>>> --- a/net/core/skbuff.c
> >>>>>> +++ b/net/core/skbuff.c
> >>>>>> @@ -5540,6 +5540,29 @@ void skb_complete_tx_timestamp(struct sk_buff *skb,
> >>>>>>     }
> >>>>>>     EXPORT_SYMBOL_GPL(skb_complete_tx_timestamp);
> >>>>>>
> >>>>>> +static bool sk_tstamp_tx_flags(struct sock *sk, u32 tsflags, int tstype)
> >>>>>
> >>>>> sk is unused.
> >>>>
> >>>> Thanks for the careful check.
> >>>>
> >>>>>
> >>>>>> +{
> >>>>>> +     u32 testflag;
> >>>>>> +
> >>>>>> +     switch (tstype) {
> >>>>>> +     case SCM_TSTAMP_SCHED:
> >>>>>
> >>>>> Instead of doing this translation,
> >>>>> is it easier to directly store the bpf prog desired ts"type" (i.e. the
> >>>>> SCM_TSTAMP_*) in the sk->sk_tsflags_bpf?
> >>>>> or there is a specific need to keep the SOF_TIMESTAMPING_* value in
> >>>>> sk->sk_tsflags_bpf?
> >>>>
> >>>> We have to reuse SOF_TIMESTAMPING_* because there are more flags, say,
> >>>> SOF_TIMESTAMPING_OPT_ID, that we need to support.
> >>>>
> >>>>>
> >>>>>> +             testflag = SOF_TIMESTAMPING_TX_SCHED;
> >>>>>> +             break;
> >>>>>> +     case SCM_TSTAMP_SND:
> >>>>>> +             testflag = SOF_TIMESTAMPING_TX_SOFTWARE;
> >>>>>> +             break;
> >>>>>> +     case SCM_TSTAMP_ACK:
> >>>>>> +             testflag = SOF_TIMESTAMPING_TX_ACK;
> >>>>>> +             break;
> >>>>>> +     default:
> >>>>>> +             return false;
> >>>>>> +     }
> >>>>>> +     if (tsflags & testflag)
> >>>>>> +             return true;
> >>>>>> +
> >>>>>> +     return false;
> >>>>>> +}
> >>>>>> +
> >>>>>>     static void skb_tstamp_tx_output(struct sk_buff *orig_skb,
> >>>>>>                                  const struct sk_buff *ack_skb,
> >>>>>>                                  struct skb_shared_hwtstamps *hwtstamps,
> >>>>>> @@ -5558,6 +5581,9 @@ static void skb_tstamp_tx_output(struct sk_buff *orig_skb,
> >>>>>>         if (!skb_may_tx_timestamp(sk, tsonly))
> >>>>>>                 return;
> >>>>>>
> >>>>>> +     if (!sk_tstamp_tx_flags(sk, tsflags, tstype))
> >>>>>
> >>>>> This is a new test. tsflags is the sk->sk_tsflags here if I read it correctly.
> >>>>
> >>>> This test will be used in bpf and non-bpf cases. Because of this, we
> >>>> can support BPF extension. In this function, if skb has tsflags but we
> >>>> don't know which approach the user expects, sk_tstamp_tx_flags() can
> >>>> help us.
> >>>>
> >>>>>
> >>>>> My understanding is the sendmsg can provide SOF_TIMESTAMPING_* for individual
> >>>>> skb. Would it break?
> >>>>
> >>>> Oh, you're right. I didn't support cmsg mode...
> >>>
> >>> I think I only need to test if it's in the bpf mode, or else let the
> >>> original way print the timestamp, which can solve the issue.
> >>
> >>   From looking at the existing "__skb_tstamp_tx(skb, NULL, NULL, skb->sk,
> >> SCM_TSTAMP_SCHED);":
> >>
> >> int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
> >> {
> >>          /* ... */
> >>
> >>          if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_SCHED_TSTAMP))
> >>                  __skb_tstamp_tx(skb, NULL, NULL, skb->sk, SCM_TSTAMP_SCHED);
> >>
> >>          /* ... */
> >> }
> >>
> >> I am still puzzling how __skb_tstamp_tx() will be called if only bpf has enabled
> >> the timestamping. I may have missed somewhere in the patch set that the skb's
> >> tx_flags is changed by sk->sk_tsflags_bpf alone?
> >
> > If sk_tsflags_bpf is set, tcp_sendmsg() -> tcp_tx_timestamp() will be
> > helpful, which initializes every last skb, please see patch [10/12].
>
> Ah. ok. It is the thing I missed. Thanks for the pointer.
>
> >>
> >> I think a skb tskey is still desired (?), so eventually we want some spaces in
> >
> > tskey function is optional I think. It depends whether users want to
> > use it or not. It can controlled by SOF_TIMESTAMPING_OPT_ID flag.
> >
> >> the skb for bpf. Jakub Sitnicki (cc-ed) has presented in LPC about extending
> >> skb->data_meta usage outside of xdp and tc. I think here we want to have it
> >> available at the tx side to store the tx_flags and tskey but probably want them
> >> at a specific place/offset at the data_meta.
> >
> > If we have the plan to store extra information in data_meta, I can
> > give it a try:)
> >
> >>
> >> For now, is there thing we can explore to share in the skb_shared_info?
> >
> > My initial thought is just to reuse these fields in skb. It can work
> > without interfering one another.
>
> After reading closer to patch 10, I am likely still missing something. How can
> it tell if the tx_flags is set by the bpf or by the user space cmsg?

If the skb carries the timestamp, there are three cases:
1) non-bpf case and users uses setsockopt()
2) cmsg case
3) bpf case

#1 and #2 are already handled well before this patch. I only need to
test if sk_tsflags_bpf has those flags. If so, it means we hit #3, or
else it could be #1 or #2, then we will let the old way print
timestamps in __skb_tstamp_tx().

>
> >
> >> Can the "struct skb_shared_hwtstamps hwtstamps;" be used for the bpf tx_flags and tskey
> >> only at the "tx" side? There is already another union member.
> >
> > tskey is always used in the tx path.
> >
> > hwtstamps can be used in both rx and tx cases (please see
> > tcp_update_recv_tstamps() and skb_tstamp_tx()).
>
> hmm... we only need some where to store the bpf tx_flags and bpf tskey in the
> TX-ing skb.

And there is one more field we have to take care of: txstamp_ack which
indicates whether we print timestamp when the last skb is acked.
Please see tcp_tx_timestamp().

> You meant the hwtstamps of a Tx-ing skb is not empty?

Sometimes, it's not empty if the hardware supports the timestamp
feature and the user wants to see it (by enabling the
SOF_TIMESTAMPING_TX_HARDWARE flag). As we can see, there are many
callers calling skb_tstamp_tx().

>
> At skb_tstamp_tx (TX side only?), the orig_skb's hwtstamps has not been written yet?

I'm not that sure about the orig_skb. It seems no. I can see some
callers reading ptp timestamp from the nic and pass the timestamp to
skb_tstamp_tx().

Thanks,
Jason

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ