Message-ID: <CAL+tcoCjZxM88TpvNDVzW+BBNU9V2a=kpBh=XZ8cHcHqsRjg1w@mail.gmail.com>
Date: Thu, 6 Mar 2025 16:22:35 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: "David S . Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Neal Cardwell <ncardwell@...gle.com>,
Kuniyuki Iwashima <kuniyu@...zon.com>, Jason Xing <kernelxing@...cent.com>,
Simon Horman <horms@...nel.org>, netdev@...r.kernel.org, eric.dumazet@...il.com
Subject: Re: [PATCH net-next 2/2] inet: call inet6_ehashfn() once from inet6_hash_connect()

On Wed, Mar 5, 2025 at 11:46 AM Eric Dumazet <edumazet@...gle.com> wrote:
>
> inet6_ehashfn() being called from __inet6_check_established()
> has a big impact on performance, as shown in the Tested section.
>
> After prior patch, we can compute the hash for port 0
> from inet6_hash_connect(), and derive each hash in
> __inet_hash_connect() from this initial hash:
>
> hash(saddr, lport, daddr, dport) == hash(saddr, 0, daddr, dport) + lport
>
> Apply the same principle for __inet_check_established(),
> although inet_ehashfn() has a smaller cost.
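
For readers following along, here is a minimal userspace sketch of the
identity quoted above. It is not the kernel code: toy_mix() and
toy_ehashfn() are hypothetical stand-ins, the addresses are plain 32-bit
values for brevity, and the port range is only an example. The only
property it relies on is that the local port is added after the
expensive mixing step, so the port-0 hash can be computed once and each
candidate port derived with a single addition.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the keyed mixing step; NOT the kernel's hash. */
static uint32_t toy_mix(uint32_t saddr, uint32_t daddr, uint16_t dport,
			uint32_t secret)
{
	uint32_t h = secret ^ 0x9e3779b9u;

	h = (h ^ saddr) * 0x85ebca6bu;
	h = (h ^ daddr) * 0xc2b2ae35u;
	h = (h ^ dport) * 0x27d4eb2fu;
	return h ^ (h >> 15);
}

/* Toy full hash: the local port is added after mixing, never mixed in. */
static uint32_t toy_ehashfn(uint32_t saddr, uint16_t lport, uint32_t daddr,
			    uint16_t dport, uint32_t secret)
{
	return toy_mix(saddr, daddr, dport, secret) + lport;
}

/*
 * Sketch of a port search: hash once for port 0, then derive the hash of
 * every candidate port by addition. Returns how many derived hashes differ
 * from a full recomputation (expected: none).
 */
static int count_mismatches(void)
{
	const uint32_t saddr = 0x0a000001u, daddr = 0x0a000002u;
	const uint16_t dport = 443;
	const uint32_t secret = 0x12345678u;
	uint32_t hash_port0 = toy_ehashfn(saddr, 0, daddr, dport, secret);
	int mismatches = 0;

	for (uint32_t lport = 32768; lport <= 60999; lport++) {
		uint32_t derived = hash_port0 + lport;
		uint32_t full = toy_ehashfn(saddr, (uint16_t)lport, daddr,
					    dport, secret);

		if (derived != full)
			mismatches++;
	}
	return mismatches;
}
```

In this sketch the loop body costs one addition per candidate port,
which is the effect the patch description attributes to deriving each
hash from the initial one.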
>
> Tested:
>
> Server: ulimit -n 40000; neper/tcp_crr -T 200 -F 30000 -6 --nolog
> Client: ulimit -n 40000; neper/tcp_crr -T 200 -F 30000 -6 --nolog -c -H server
>
> Before this patch:
>
> utime_start=0.286131
> utime_end=4.378886
> stime_start=11.952556
> stime_end=1991.655533
> num_transactions=1446830
> latency_min=0.001061085
> latency_max=12.075275028
> latency_mean=0.376375302
> latency_stddev=1.361969596
> num_samples=306383
> throughput=151866.56
>
> perf top:
>
> 50.01% [kernel] [k] __inet6_check_established
> 20.65% [kernel] [k] __inet_hash_connect
> 15.81% [kernel] [k] inet6_ehashfn
> 2.92% [kernel] [k] rcu_all_qs
> 2.34% [kernel] [k] __cond_resched
> 0.50% [kernel] [k] _raw_spin_lock
> 0.34% [kernel] [k] sched_balance_trigger
> 0.24% [kernel] [k] queued_spin_lock_slowpath
>
> After this patch:
>
> utime_start=0.315047
> utime_end=9.257617
> stime_start=7.041489
> stime_end=1923.688387
> num_transactions=3057968
> latency_min=0.003041375
> latency_max=7.056589232
> latency_mean=0.141075048 # Better latency metrics
> latency_stddev=0.526900516
> num_samples=312996
> throughput=320677.21 # 111 % increase, and 229 % for the series
>
> perf top: inet6_ehashfn is no longer seen.
>
> 39.67% [kernel] [k] __inet_hash_connect
> 37.06% [kernel] [k] __inet6_check_established
> 4.79% [kernel] [k] rcu_all_qs
> 3.82% [kernel] [k] __cond_resched
> 1.76% [kernel] [k] sched_balance_domains
> 0.82% [kernel] [k] _raw_spin_lock
> 0.81% [kernel] [k] sched_balance_rq
> 0.81% [kernel] [k] sched_balance_trigger
> 0.76% [kernel] [k] queued_spin_lock_slowpath
>
> Signed-off-by: Eric Dumazet <edumazet@...gle.com>

Thank you!

Tested-by: Jason Xing <kerneljasonxing@...il.com>
Reviewed-by: Jason Xing <kerneljasonxing@...il.com>