Message-ID: <20220122032837.lng6uwg5ugomplpx@kafai-mbp.dhcp.thefacebook.com>
Date: Fri, 21 Jan 2022 19:28:37 -0800
From: Martin KaFai Lau <kafai@...com>
To: Julian Anastasov <ja@....bg>
CC: <bpf@...r.kernel.org>, <netdev@...r.kernel.org>,
Alexei Starovoitov <ast@...nel.org>,
Andrii Nakryiko <andrii@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
David Miller <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, <kernel-team@...com>,
Willem de Bruijn <willemb@...gle.com>
Subject: Re: [RFC PATCH v3 net-next 3/4] net: Set skb->mono_delivery_time and
clear it when delivering locally
On Fri, Jan 21, 2022 at 02:02:23PM +0200, Julian Anastasov wrote:
> > diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
> > index 3a025c011971..35311ca75496 100644
> > --- a/net/ipv4/ip_input.c
> > +++ b/net/ipv4/ip_input.c
> > @@ -244,6 +244,7 @@ int ip_local_deliver(struct sk_buff *skb)
> > */
> > struct net *net = dev_net(skb->dev);
> >
> > + skb_clear_delivery_time(skb);
>
> Is it safe to move this line into ip_local_deliver_finish ?
>
> > if (ip_is_fragment(ip_hdr(skb))) {
> > if (ip_defrag(net, skb, IP_DEFRAG_LOCAL_DELIVER))
> > return 0;
> > diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
> > index 80256717868e..84f93864b774 100644
> > --- a/net/ipv6/ip6_input.c
> > +++ b/net/ipv6/ip6_input.c
> > @@ -469,6 +469,7 @@ static int ip6_input_finish(struct net *net, struct sock *sk, struct sk_buff *sk
> >
> > int ip6_input(struct sk_buff *skb)
> > {
> > + skb_clear_delivery_time(skb);
>
> Is it safe to move this line into ip6_input_finish?
> The problem for both cases is that IPVS hooks at LOCAL_IN and
> can decide to forward the packet by returning NF_STOLEN and
> avoiding the _finish code. In short, before reaching the
> _finish code it is still not decided that packet reaches the
> sockets.
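
To restate the concern in code, here is a rough userspace model, not kernel code. Every name in it (`model_pkt`, `model_input()`, `model_input_finish()`, the `MODEL_*` verdicts) is hypothetical and only mimics the shape of a LOCAL_IN hook followed by a _finish function. It assumes the clear is done in the _finish step, as suggested above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical verdicts, loosely modelled on NF_ACCEPT / NF_STOLEN. */
enum model_verdict { MODEL_ACCEPT, MODEL_STOLEN };

/* Hypothetical stand-in for the bits of sk_buff state that matter here. */
struct model_pkt {
	bool has_delivery_time;   /* models skb->mono_delivery_time */
	bool delivered_locally;   /* set once the _finish step has run */
};

typedef enum model_verdict (*model_hook_fn)(struct model_pkt *pkt);

/* A hook that lets the packet continue to local delivery. */
static enum model_verdict model_hook_accept(struct model_pkt *pkt)
{
	(void)pkt;
	return MODEL_ACCEPT;
}

/* A hook that takes the packet over, e.g. to forward it itself. */
static enum model_verdict model_hook_steal(struct model_pkt *pkt)
{
	(void)pkt;
	return MODEL_STOLEN;
}

/* Models the _finish step: only here is local delivery certain,
 * so (by assumption) this is where the delivery time is cleared. */
static void model_input_finish(struct model_pkt *pkt)
{
	pkt->has_delivery_time = false;
	pkt->delivered_locally = true;
}

/* Models the input function: run the hook, then _finish unless stolen. */
static void model_input(struct model_pkt *pkt, model_hook_fn hook)
{
	assert(pkt != NULL && hook != NULL);
	if (hook(pkt) == MODEL_STOLEN)
		return;               /* _finish is skipped entirely */
	model_input_finish(pkt);
}
```

In this model a stolen packet keeps its delivery time, because the _finish step never runs for it. If the clear sat at the top of `model_input()` instead, the hook would only ever see a packet whose delivery time was already gone.
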
hmm...
In theory, it should be possible to clear it later: the ingress path
cannot assume the (rcv) timestamp is always available, so it should
already expect to handle the 0 case and call ktime_get_real(), e.g. the
tapping case used by af_packet. The tradeoff is a later (rcv) timestamp
and more code churn, e.g. something around the ip_is_fragment()
handling may need to change.
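
A minimal userspace sketch of the "handle the 0 case" idea, not the kernel implementation. `struct model_skb` and both helpers are hypothetical stand-ins, the clear is assumed to simply zero the timestamp, and `timespec_get()` stands in for ktime_get_real().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the two sk_buff fields discussed here. */
struct model_skb {
	int64_t tstamp_ns;        /* models skb->tstamp */
	bool mono_delivery_time;  /* models skb->mono_delivery_time */
};

/* Models the clear step (assumed behaviour): if tstamp holds a delivery
 * time, drop it so it is not mistaken for a receive timestamp. */
static void model_clear_delivery_time(struct model_skb *skb)
{
	assert(skb != NULL);
	if (skb->mono_delivery_time) {
		skb->mono_delivery_time = false;
		skb->tstamp_ns = 0;
	}
}

/* Models a receive-side consumer: a 0 timestamp means "not stamped yet",
 * so take the wall-clock time now. The later the clear happens, the later
 * this fallback timestamp will be. */
static int64_t model_rcv_timestamp_ns(struct model_skb *skb)
{
	assert(skb != NULL);
	if (skb->tstamp_ns == 0) {
		struct timespec ts;

		timespec_get(&ts, TIME_UTC);
		skb->tstamp_ns = (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
	}
	return skb->tstamp_ns;
}
```

A packet that arrives with a real receive timestamp passes through both helpers unchanged. Only a packet that carried a delivery time ends up with the fallback wall-clock value.
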
My initial attempt was to call skb_clear_delivery_time()
right after sch_handle_ingress() in dev.c. However, it did not seem to
take much to make ip[6]_forward work as well, so I pushed it here. That,
though, seems to make the other kernel forward paths inconsistent in
terms of whether the delivery_time is expected to be kept.
I will give it a try in v4, but I am not very sure about it yet, before
looking closer. The worst case is to move it back to just after
sch_handle_ingress() so that the kernel forward paths still behave
consistently, but I will try the later clear first.
Thanks for the review!