Message-ID: <CA+FuTSfBTQ+G3i6j8LPi7PHZWnSx5msdMYoUURdp5Z2d3S6gDA@mail.gmail.com>
Date: Tue, 7 Dec 2021 19:44:05 -0500
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Martin KaFai Lau <kafai@...com>
Cc: Willem de Bruijn <willemdebruijn.kernel@...il.com>,
netdev@...r.kernel.org, Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
David Miller <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, kernel-team@...com
Subject: Re: [RFC PATCH net-next 2/2] net: Reset forwarded skb->tstamp before
delivering to user space
> > > -static inline ktime_t skb_get_ktime(const struct sk_buff *skb)
> > > +static inline ktime_t skb_get_ktime(struct sk_buff *skb)
> > >  {
> > > +	if (unlikely(skb->fwd_tstamp))
> > > +		net_timestamp_set(skb);
> > >  	return ktime_mono_to_real_cond(skb->tstamp);
> >
> > This changes timestamp behavior for existing applications, probably
> > worth mentioning in the commit message if nothing else. A timestamp
> > taken at the time of the recv syscall is not very useful.
> >
> > If a forwarded timestamp is not a future delivery time (as those are
> > scrubbed), is it not correct to just deliver the original timestamp?
> > It probably was taken at some earlier __netif_receive_skb_core.
> Makes sense. I will compare with the current mono clock first before
> resetting and also mention this behavior change in the commit message.
>
> Do you think it will be too heavy to always compare with
> the current time without testing the skb->fwd_tstamp bit
> first?
There are other examples of code using ktime_get and variants in the
hot path, such as FQ.
Especially if skb_get_ktime is called in recv() timestamp helpers, it
is perhaps acceptable, if not ideal. If we need an skb bit anyway,
then this is moot.
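To make the comparison concrete, here is a minimal userspace sketch of the check discussed above. It is not the actual patch: struct rx_pkt and rx_pkt_tstamp() are hypothetical stand-ins for the skb and skb_get_ktime(), and the current monotonic time is passed in as a parameter where kernel code would presumably read it with ktime_get(). It assumes a stored value later than "now" can only be a future delivery time.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct sk_buff; only the field discussed here. */
struct rx_pkt {
	int64_t tstamp_ns;	/* monotonic nanoseconds, 0 means "not set" */
};

/*
 * Return the timestamp to report to user space.
 *
 * A value in the past is assumed to be a genuine receive timestamp taken
 * earlier on the path, so it is delivered unchanged. A value in the future
 * is assumed to be a delivery time and is replaced by the current time.
 */
static int64_t rx_pkt_tstamp(struct rx_pkt *p, int64_t now_mono_ns)
{
	if (p->tstamp_ns > now_mono_ns)
		p->tstamp_ns = now_mono_ns;
	return p->tstamp_ns;
}
```

With this shape the clock is read on every call, which is the cost being weighed against testing an skb bit first.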
> >
> > > }
> > >
> > > -static inline void net_timestamp_set(struct sk_buff *skb)
> > > +void net_timestamp_set(struct sk_buff *skb)
> > >  {
> > >  	skb->tstamp = 0;
> > > +	skb->fwd_tstamp = 0;
> > >  	if (static_branch_unlikely(&netstamp_needed_key))
> > >  		__net_timestamp(skb);
> > >  }
> > > +EXPORT_SYMBOL(net_timestamp_set);
> > >
> > > #define net_timestamp_check(COND, SKB) \
> > > if (static_branch_unlikely(&netstamp_needed_key)) { \
> > > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > > index f091c7807a9e..181ddc989ead 100644
> > > --- a/net/core/skbuff.c
> > > +++ b/net/core/skbuff.c
> > > @@ -5295,8 +5295,12 @@ void skb_scrub_tstamp(struct sk_buff *skb)
> > >  {
> > >  	struct sock *sk = skb->sk;
> > >
> > > -	if (sk && sk_fullsock(sk) && sock_flag(sk, SOCK_TXTIME))
> > > +	if (sk && sk_fullsock(sk) && sock_flag(sk, SOCK_TXTIME)) {
> >
> > There is a slight race here with the socket flipping the feature on/off.
> Right, I think it is an inherited race by relating skb->tstamp with
> a bit in sk, like the existing sch_etf.c.
> Directly setting a bit in skb when setting the skb->tstamp will help.
>
> >
> > >
> > >  		skb->tstamp = 0;
> > > +		skb->fwd_tstamp = 0;
> > > +	} else if (skb->tstamp) {
> > > +		skb->fwd_tstamp = 1;
> > > +	}
> >
> > SO_TXTIME future delivery times are scrubbed, but TCP future delivery
> > times are not?
> It is not so much about scrubbing future SO_TXTIME or future TCP
> delivery time for the local delivery.
The purpose of the above is to reset future delivery time whenever it
can be mistaken for a timestamp, right?
This function is called on forwarding, redirection, looping from
egress to ingress with __dev_forward_skb, etc. But then it breaks the
delivery time forwarding over veth that I thought was the purpose of
this patch series. I guess I'm a bit hazy on when exactly this is
supposed to be scrubbed.
> fwd_mono_tstamp may be a better name. It is about the forwarded tstamp
> being in mono.
After your change skb->tstamp is no longer in CLOCK_REALTIME, right?
Somewhat annoyingly, that does not imply that it is always
CLOCK_MONOTONIC. Because while FQ uses that, ETF is programmed with
CLOCK_TAI.
Perhaps skb->delivery_time is the most specific description. And that
is easy to test for in skb_scrub_tstamp.
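As a rough userspace illustration of that suggestion, and not a proposed patch: struct dt_pkt and the three helpers below are hypothetical stand-ins, with the bit set wherever a delivery time is written and cleared wherever a receive timestamp is written. The scrub then tests only the packet's own bit and never looks at socket state. Whether the scrub should clear or preserve the value on a given path is the open question above; the sketch only shows the test.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct sk_buff; only the fields discussed here. */
struct dt_pkt {
	int64_t tstamp_ns;	/* receive timestamp or delivery time */
	bool delivery_time;	/* true if tstamp_ns holds a delivery time */
};

/* Transmit side: record a delivery time and mark it as such. */
static void dt_pkt_set_delivery_time(struct dt_pkt *p, int64_t when_ns)
{
	p->tstamp_ns = when_ns;
	p->delivery_time = true;
}

/* Receive side: record a receive timestamp and clear the mark. */
static void dt_pkt_set_rx_tstamp(struct dt_pkt *p, int64_t now_ns)
{
	p->tstamp_ns = now_ns;
	p->delivery_time = false;
}

/* Scrub: drop the value only when it is known to be a delivery time. */
static void dt_pkt_scrub_tstamp(struct dt_pkt *p)
{
	if (p->delivery_time) {
		p->tstamp_ns = 0;
		p->delivery_time = false;
	}
}
```

Because the bit travels with the packet, the decision does not depend on which clock the qdisc uses, and it cannot race with the socket toggling SO_TXTIME.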
> e.g. the packet from a container-netns can be queued
> at the fq@...tns (the case described in patch 1 commit log).
> Also, the bpf@...ress@...h@...tns can now expect skb->tstamp to be in
> mono time. The BPF side does not have a helper returning the real clock,
> so it is safe to assume that the bpf prog is comparing (or setting)
> skb->tstamp in mono also.
>
> > If adding a bit, might it be simpler to add a bit tstamp_is_edt, and
> > scrub based on that? That is also not open to the above race.
> It was one of my earlier attempts: adding tstamp_is_tx_mono, setting
> it in tcp_output.c, and then testing it before scrubbing.
> Other than changing tcp_output.c (e.g. in __tcp_transmit_skb),
> I ended up making another change on the bpf side to also set
> this bit when the bpf_prog is updating the __sk_buff->tstamp. Thus,
> in this patch, I ended up setting a bit only in the forward path.
>
> I can go back to retry the tstamp_is_edt/tstamp_is_tx_mono idea and
> that can also avoid the race in testing sock_flag(sk, SOCK_TXTIME)
> as you suggested.
Sounds great, thanks