Message-ID: <20220212232757.bzcdp5apb4r7whd7@kafai-mbp.dhcp.thefacebook.com>
Date: Sat, 12 Feb 2022 15:27:57 -0800
From: Martin KaFai Lau <kafai@...com>
To: Cong Wang <xiyou.wangcong@...il.com>
CC: <bpf@...r.kernel.org>, <netdev@...r.kernel.org>,
Alexei Starovoitov <ast@...nel.org>,
Andrii Nakryiko <andrii@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
David Miller <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, <kernel-team@...com>,
Willem de Bruijn <willemb@...gle.com>
Subject: Re: [PATCH v4 net-next 1/8] net: Add skb->mono_delivery_time to
distinguish mono delivery_time from (rcv) timestamp

On Sat, Feb 12, 2022 at 11:13:53AM -0800, Cong Wang wrote:
> On Thu, Feb 10, 2022 at 11:12:38PM -0800, Martin KaFai Lau wrote:
> > skb->tstamp was first used as the (rcv) timestamp.
> > The major usage is to report it to the user (e.g. SO_TIMESTAMP).
> >
> > Later, skb->tstamp is also set as the (future) delivery_time (e.g. EDT in TCP)
> > during egress and used by the qdisc (e.g. sch_fq) to make decision on when
> > the skb can be passed to the dev.
> >
> > Currently, there is no way to tell skb->tstamp having the (rcv) timestamp
> > or the delivery_time, so it is always reset to 0 whenever forwarded
> > between egress and ingress.
> >
> > While it makes sense to always clear the (rcv) timestamp in skb->tstamp
> > to avoid confusing sch_fq that expects the delivery_time, it is a
> > performance issue [0] to clear the delivery_time if the skb finally
> > egress to a fq@...-dev. For example, when forwarding from egress to
> > ingress and then finally back to egress:
> >
> > tcp-sender => veth@...ns => veth@...tns => fq@...0@...tns
> >               ^             ^
> >               reset         reset
> >
> > This patch adds one bit skb->mono_delivery_time to flag the skb->tstamp
> > is storing the mono delivery_time (EDT) instead of the (rcv) timestamp.
> >
> > The current use case is to keep the TCP mono delivery_time (EDT) and
> > to be used with sch_fq. A later patch will also allow tc-bpf@...ress
> > to read and change the mono delivery_time.
>
> Can you be more specific? How is the fq in the hostns even visible to
> container ns? More importantly, why the packets are always going out from
> container to eth0?
>
> From the sender's point of view, it can't see the hostns and can't even
> know whether the packets are routed to eth0 or other containers on the
> same host. So I don't see how this makes sense.

The sender does not need to know whether fq is installed anywhere or
how the packet will be routed. That is completely orthogonal.
Today, TCP always sets the EDT without knowing where the packet
will be routed or whether fq (or any other lower-layer code) that
uses the EDT is installed anywhere along the routing path.
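
To illustrate the point, here is a rough userspace sketch, not kernel code.
All toy_* names are made up for this example, and "now + pacing delay" is
a simplification of how TCP actually picks the EDT. The sender stamps the
packet unconditionally; whether anything acts on the stamp depends only
on which consumer the packet happens to reach:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an skb; only the timestamp field is modelled. */
struct toy_pkt {
    int64_t tstamp; /* earliest departure time (EDT), in ns */
};

/* Sender side: always set the EDT. It takes no argument describing the
 * route or the qdisc because the sender does not know either.
 * Assumption: EDT = now + pacing delay (simplified).
 */
void toy_tcp_set_edt(struct toy_pkt *p, int64_t now_ns, int64_t pacing_delay_ns)
{
    p->tstamp = now_ns + pacing_delay_ns;
}

/* An fq-like consumer: hold the packet until its EDT has been reached. */
bool toy_fq_may_send(const struct toy_pkt *p, int64_t now_ns)
{
    return p->tstamp <= now_ns;
}

/* A consumer that does not look at the EDT at all: send immediately. */
bool toy_fifo_may_send(const struct toy_pkt *p, int64_t now_ns)
{
    (void)p;
    (void)now_ns;
    return true;
}
```

The same stamped packet is valid input for both consumers, so the sender
side is unchanged whichever one ends up in the path.
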
> Crossing netns is pretty much like delivering on wire, *generally speaking*
> if the skb meta data is not preserved on wire, it probably should not for
> crossing netns either.

There are many fields in the skb that are not cleared. In general, a field
is cleared only when that is needed. E.g. skb->sk in the veth case above,
and sk has info that is not even in the tcp/ip packet itself. The delivery
time had to be cleared because there is no way to distinguish between
the rcv timestamp and the delivery time.
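
As a rough illustration only: this is a userspace model, not the patch
itself. struct toy_skb and the toy_forward_* helpers are hypothetical names,
and only the two fields under discussion are modelled. The one-bit flag
lets the forwarding path keep a delivery time while still clearing a (rcv)
timestamp:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two sk_buff fields discussed here. */
struct toy_skb {
    int64_t tstamp;             /* (rcv) timestamp or delivery time, in ns */
    bool    mono_delivery_time; /* true: tstamp holds a mono delivery time */
};

/* Without the flag, the forwarding path cannot tell which meaning tstamp
 * has, so it always resets it to 0. The flag is deliberately ignored here.
 */
void toy_forward_without_flag(struct toy_skb *skb)
{
    skb->tstamp = 0;
}

/* With the flag, only a (rcv) timestamp is cleared. A mono delivery time
 * is kept so that a later fq can still use it.
 */
void toy_forward_with_flag(struct toy_skb *skb)
{
    if (skb->mono_delivery_time)
        return;
    skb->tstamp = 0;
}
```

With the flag set, toy_forward_with_flag() leaves tstamp untouched; with
the flag clear, it behaves like the unconditional reset.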