lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANn89iKBzOOMSQv5U8vpRcNtEYmPtOzqOWLxNgyjAnGOC=Bx+A@mail.gmail.com>
Date: Tue, 8 Oct 2024 16:15:37 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: Breno Leitao <leitao@...ian.org>
Cc: Paolo Abeni <pabeni@...hat.com>, David Ahern <dsahern@...nel.org>, 
	"David S. Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>, rmikey@...a.com, 
	kernel-team@...a.com, horms@...nel.org, 
	"open list:NETWORKING [IPv4/IPv6]" <netdev@...r.kernel.org>, open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH net-next] net: Optimize IPv6 path in ip_neigh_for_gw()

On Tue, Oct 8, 2024 at 4:07 PM Breno Leitao <leitao@...ian.org> wrote:
>
> Hello Paolo,
>
> On Tue, Oct 08, 2024 at 12:51:05PM +0200, Paolo Abeni wrote:
> > On 10/4/24 19:37, Breno Leitao wrote:
> > > On Fri, Oct 04, 2024 at 11:01:29AM -0600, David Ahern wrote:
> > > > On 10/4/24 10:27 AM, Breno Leitao wrote:
> > > > > Branch annotation traces from approximately 200 IPv6-enabled hosts
> > > > > revealed that the 'likely' branch in ip_neigh_for_gw() was consistently
> > > > > mispredicted. Given the increasing prevalence of IPv6 in modern networks,
> > > > > this commit adjusts the function to favor the IPv6 path.
> > > > >
> > > > > Swap the order of the conditional statements and move the 'likely'
> > > > > annotation to the IPv6 case. This change aims to improve performance in
> > > > > IPv6-dominant environments by reducing branch mispredictions.
> > > > >
> > > > > This optimization aligns with the trend of IPv6 becoming the default IP
> > > > > version in many deployments, and should benefit modern network
> > > > > configurations.
> > > > >
> > > > > Signed-off-by: Breno Leitao <leitao@...ian.org>
> > > > > ---
> > > > >   include/net/route.h | 6 +++---
> > > > >   1 file changed, 3 insertions(+), 3 deletions(-)
> > > > >
> > > > > diff --git a/include/net/route.h b/include/net/route.h
> > > > > index 1789f1e6640b..b90b7b1effb8 100644
> > > > > --- a/include/net/route.h
> > > > > +++ b/include/net/route.h
> > > > > @@ -389,11 +389,11 @@ static inline struct neighbour *ip_neigh_for_gw(struct rtable *rt,
> > > > >         struct net_device *dev = rt->dst.dev;
> > > > >         struct neighbour *neigh;
> > > > > -       if (likely(rt->rt_gw_family == AF_INET)) {
> > > > > -               neigh = ip_neigh_gw4(dev, rt->rt_gw4);
> > > > > -       } else if (rt->rt_gw_family == AF_INET6) {
> > > > > +       if (likely(rt->rt_gw_family == AF_INET6)) {
> > > > >                 neigh = ip_neigh_gw6(dev, &rt->rt_gw6);
> > > > >                 *is_v6gw = true;
> > > > > +       } else if (rt->rt_gw_family == AF_INET) {
> > > > > +               neigh = ip_neigh_gw4(dev, rt->rt_gw4);
> > > > >         } else {
> > > > >                 neigh = ip_neigh_gw4(dev, ip_hdr(skb)->daddr);
> > > > >         }
> > > >
> > > > This is an IPv4 function allowing support for IPv6 addresses as a
> > > > nexthop. It is appropriate for IPv4 family checks to be first.
> > >
> > > Right. In which case is this called on IPv6 only systems?
> > >
> > > On my IPv6-only 200 systems, the annotated branch predictor is showing
> > > it is mispredicted 100% of the time.
> >
> > perf probe -a ip_neigh_for_gw; perf record -e probe:ip_neigh_for_gw -ag;
> > perf script
> >
> > should give you an hint.
>
> Thanks. That proved to be very useful.
>
> As I said above, all the hosts I have a webserver running, I see this
> that likely mispredicted. Same for this server:
>
>         # cat /sys/kernel/tracing/trace_stat/branch_annotated | grep ip_neigh_for_gw
>          correct incorrect  %        Function                  File              Line
>                0    17127 100 ip_neigh_for_gw                route.h              393
>
> It is mostly coming from ip_finish_output2() and tcp_v4. Important to
> say that these machine has no IPv4 configured, except 127.0.0.1
> (localhost).

Now run the experiment on a typical server using IPv4 ?

I would advise removing the likely() if it really bothers you.
(I doubt this has any impact)

But assuming everything is IPv6 is too soon.

There are more obvious changes like :

diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
index b6e7d4921309741193a8c096aeb278255ec56794..445f4fe712603e8c14b1006ad4cbaac278bae4ea
100644
--- a/net/ipv4/ip_input.c
+++ b/net/ipv4/ip_input.c
@@ -462,7 +462,7 @@ static struct sk_buff *ip_rcv_core(struct sk_buff
*skb, struct net *net)
        /* When the interface is in promisc. mode, drop all the crap
         * that it receives, do not try to analyse it.
         */
-       if (skb->pkt_type == PACKET_OTHERHOST) {
+       if (unlikely(skb->pkt_type == PACKET_OTHERHOST)) {
                dev_core_stats_rx_otherhost_dropped_inc(skb->dev);
                drop_reason = SKB_DROP_REASON_OTHERHOST;
                goto drop;
diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
index 70c0e16c0ae6837d1c64d0036829c8b61799578b..3d0797afa499fa880eb5452a0dea8a23505b3e60
100644
--- a/net/ipv6/ip6_input.c
+++ b/net/ipv6/ip6_input.c
@@ -153,7 +153,7 @@ static struct sk_buff *ip6_rcv_core(struct sk_buff
*skb, struct net_device *dev,
        u32 pkt_len;
        struct inet6_dev *idev;

-       if (skb->pkt_type == PACKET_OTHERHOST) {
+       if (unlikely(skb->pkt_type == PACKET_OTHERHOST)) {
                dev_core_stats_rx_otherhost_dropped_inc(skb->dev);
                kfree_skb_reason(skb, SKB_DROP_REASON_OTHERHOST);
                return NULL;

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ