[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+FuTSf_GL2gfGnDnZiVzHpjbV6+bw25Pi-FMNdUGH4np9=N3Q@mail.gmail.com>
Date: Mon, 18 Nov 2019 15:29:25 -0500
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Paolo Abeni <pabeni@...hat.com>
Cc: Network Development <netdev@...r.kernel.org>,
"David S. Miller" <davem@...emloft.net>,
Willem de Bruijn <willemdebruijn.kernel@...il.com>,
Edward Cree <ecree@...arflare.com>
Subject: Re: [PATCH net-next v2 1/2] ipv6: introduce and uses route look hints
for list input
On Mon, Nov 18, 2019 at 6:03 AM Paolo Abeni <pabeni@...hat.com> wrote:
>
> When doing RX batch packet processing, we currently always repeat
> the route lookup for each ingress packet. If policy routing is
> configured, and IPV6_SUBTREES is disabled at build time, we
> know that packets with the same destination address will use
> the same dst.
>
> This change tries to avoid per packet route lookup caching
> the destination address of the latest successful lookup, and
> reusing it for the next packet when the above conditions are
> in place. Ingress traffic for most servers should fit.
>
> The measured performance delta under UDP flood vs a recvmmsg
> receiver is as follow:
>
> vanilla patched delta
> Kpps Kpps %
> 1431 1664 +14
Since IPv4 speed-up is almost half and code considerably more complex,
maybe only do IPv6?
>
> In the worst-case scenario - each packet has a different
> destination address - the performance delta is within noise
> range.
>
> v1 -> v2:
> - fix build issue with !CONFIG_IPV6_MULTIPLE_TABLES
> - fix potential race when fib6_has_custom_rules is set
> while processing a packet batch
>
> Signed-off-by: Paolo Abeni <pabeni@...hat.com>
> ---
> net/ipv6/ip6_input.c | 40 ++++++++++++++++++++++++++++++++++++----
> 1 file changed, 36 insertions(+), 4 deletions(-)
>
> diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
> index ef7f707d9ae3..f559ad6b09ef 100644
> --- a/net/ipv6/ip6_input.c
> +++ b/net/ipv6/ip6_input.c
> @@ -44,10 +44,16 @@
> #include <net/inet_ecn.h>
> #include <net/dst_metadata.h>
>
> +struct ip6_route_input_hint {
> + unsigned long refdst;
> + struct in6_addr daddr;
> +};
> +
> INDIRECT_CALLABLE_DECLARE(void udp_v6_early_demux(struct sk_buff *));
> INDIRECT_CALLABLE_DECLARE(void tcp_v6_early_demux(struct sk_buff *));
> static void ip6_rcv_finish_core(struct net *net, struct sock *sk,
> - struct sk_buff *skb)
> + struct sk_buff *skb,
> + struct ip6_route_input_hint *hint)
> {
> void (*edemux)(struct sk_buff *skb);
>
> @@ -59,7 +65,13 @@ static void ip6_rcv_finish_core(struct net *net, struct sock *sk,
> INDIRECT_CALL_2(edemux, tcp_v6_early_demux,
> udp_v6_early_demux, skb);
> }
> - if (!skb_valid_dst(skb))
> +
> + if (skb_valid_dst(skb))
> + return;
> +
> + if (hint && ipv6_addr_equal(&hint->daddr, &ipv6_hdr(skb)->daddr))
> + __skb_dst_copy(skb, hint->refdst);
> + else
> ip6_route_input(skb);
Is it possible to do the address comparison in ip6_list_rcv_finish
itself and pass a pointer to refdst if safe? To avoid new struct
definition, memcpy and to have all logic in one place. Need to
keep a pointer to the prev skb, then, instead.
> static void ip6_list_rcv_finish(struct net *net, struct sock *sk,
> struct list_head *head)
> {
> + struct ip6_route_input_hint _hint, *hint = NULL;
> struct dst_entry *curr_dst = NULL;
> struct sk_buff *skb, *next;
> struct list_head sublist;
> @@ -104,9 +127,18 @@ static void ip6_list_rcv_finish(struct net *net, struct sock *sk,
> skb = l3mdev_ip6_rcv(skb);
> if (!skb)
> continue;
> - ip6_rcv_finish_core(net, sk, skb);
> + ip6_rcv_finish_core(net, sk, skb, hint);
> dst = skb_dst(skb);
> if (curr_dst != dst) {
> + if (ip6_can_cache_route_hint(net)) {
> + _hint.refdst = skb->_skb_refdst;
> + memcpy(&_hint.daddr, &ipv6_hdr(skb)->daddr,
> + sizeof(_hint.daddr));
> + hint = &_hint;
> + } else {
> + hint = NULL;
> + }
not needed. ip6_can_cache_route_hit is the same for all iterations of
the loop (indeed, compile time static), so if false, hint is never
set.
> +
> /* dispatch old sublist */
> if (!list_empty(&sublist))
> ip6_sublist_rcv_finish(&sublist);
> --
> 2.21.0
>
Powered by blists - more mailing lists