Message-ID: <CAL+tcoC8qW_N62U9z+eQWsDwQ-w6f9Voy87E2a5MJC5C71fSYA@mail.gmail.com>
Date: Thu, 6 Mar 2025 16:19:13 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: "David S . Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Neal Cardwell <ncardwell@...gle.com>,
Kuniyuki Iwashima <kuniyu@...zon.com>, Jason Xing <kernelxing@...cent.com>,
Simon Horman <horms@...nel.org>, netdev@...r.kernel.org, eric.dumazet@...il.com
Subject: Re: [PATCH net-next 1/2] inet: change lport contribution to
inet_ehashfn() and inet6_ehashfn()
On Thu, Mar 6, 2025 at 4:14 PM Eric Dumazet <edumazet@...gle.com> wrote:
>
> On Thu, Mar 6, 2025 at 8:54 AM Jason Xing <kerneljasonxing@...il.com> wrote:
> >
> > On Wed, Mar 5, 2025 at 11:46 AM Eric Dumazet <edumazet@...gle.com> wrote:
> > >
> > > In order to speed up __inet_hash_connect(), we want to ensure hash values
> > > for <source address, port X, destination address, destination port>
> > > are not randomly spread, but monotonically increasing.
> > >
> > > Goal is to allow __inet_hash_connect() to derive the hash value
> > > of a candidate 4-tuple with a single addition in the following
> > > patch in the series.
> > >
> > > Given :
> > > hash_0 = inet_ehashfn(saddr, 0, daddr, dport)
> > > hash_sport = inet_ehashfn(saddr, sport, daddr, dport)
> > >
> > > Then (hash_sport == hash_0 + sport) for all sport values.
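
A minimal sketch of the property quoted above, assuming the source port is simply added to a keyed hash of the other three fields. The names `toy_mix`, `toy_ehashfn` and `candidate_hash` are hypothetical stand-ins (SHA-256 truncated to 32 bits), not the kernel's inet_ehashfn(), and the addition is taken modulo 2**32:

```python
import hashlib
import struct

MASK32 = 0xFFFFFFFF  # hash values are treated as 32-bit unsigned integers


def toy_mix(saddr, daddr, dport, secret):
    """Hypothetical keyed mix of everything except the source port."""
    data = struct.pack("!IIHI", saddr, daddr, dport, secret)
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")


def toy_ehashfn(saddr, sport, daddr, dport, secret=0x12345678):
    """Toy hash in which sport contributes by plain addition (mod 2**32)."""
    return (sport + toy_mix(saddr, daddr, dport, secret)) & MASK32


def candidate_hash(hash_0, sport):
    """Derive the hash for a candidate source port with a single addition."""
    return (hash_0 + sport) & MASK32
```

With this shape, hash_0 is computed once per (saddr, daddr, dport), and every candidate port then costs one addition instead of a full hash computation.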
> > >
> > > As far as I know, there is no security implication with this change.
> >
> > Good to know. The moment I read the first paragraph, I was
> > wondering whether it might bring potential risk.
> >
> > I hesitate to bring up one question: could this new
> > algorithm result in sockets concentrating in a few buckets instead
> > of being sufficiently dispersed like before?
>
> As I said, I see no difference for servers, since their sport is a fixed value.
>
> What matters for them is the hash contribution of the remote address and port,
> because the server port is usually well known.
>
> This change does not alter the hash distribution; an attacker will not be able
> to target a particular bucket.

Point taken. Thank you very much for the explanation.

Thanks,
Jason

>
> > The good news is that I
> > tested other cases like TCP_CRR and saw no degradation in performance.
> > But they didn't cover the case of one client establishing connections
> > to many different servers.
> >
> > >
> > > After this patch, when __inet_hash_connect() has to try XXXX candidates,
> > > the hash table buckets are contiguous and packed, allowing
> > > a better use of cpu caches and hardware prefetchers.
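
To illustrate why the candidate buckets end up adjacent, here is a small sketch. It assumes a power-of-two table indexed by `hash & (table_size - 1)` and consecutive candidate ports; `bucket_indices` is a hypothetical helper, and the real port search order in __inet_hash_connect() may differ:

```python
def bucket_indices(hash_0, first_port, count, table_size):
    """Bucket index for each of `count` consecutive candidate source ports."""
    if table_size <= 0 or table_size & (table_size - 1):
        raise ValueError("table_size must be a power of two")
    mask = table_size - 1
    # Each candidate hash is hash_0 + port, so indices advance by one
    # per port and wrap around at the end of the table.
    return [(hash_0 + port) & mask
            for port in range(first_port, first_port + count)]
```

Under these assumptions, neighbouring ports land in neighbouring buckets, which is the access pattern that caches and hardware prefetchers handle well.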
> > >
> > > Tested:
> > >
> > > Server: ulimit -n 40000; neper/tcp_crr -T 200 -F 30000 -6 --nolog
> > > Client: ulimit -n 40000; neper/tcp_crr -T 200 -F 30000 -6 --nolog -c -H server
> > >
> > > Before this patch:
> > >
> > > utime_start=0.271607
> > > utime_end=3.847111
> > > stime_start=18.407684
> > > stime_end=1997.485557
> > > num_transactions=1350742
> > > latency_min=0.014131929
> > > latency_max=17.895073144
> > > latency_mean=0.505675853
> > > latency_stddev=2.125164772
> > > num_samples=307884
> > > throughput=139866.80
> > >
> > > perf top on client:
> > >
> > > 56.86% [kernel] [k] __inet6_check_established
> > > 17.96% [kernel] [k] __inet_hash_connect
> > > 13.88% [kernel] [k] inet6_ehashfn
> > > 2.52% [kernel] [k] rcu_all_qs
> > > 2.01% [kernel] [k] __cond_resched
> > > 0.41% [kernel] [k] _raw_spin_lock
> > >
> > > After this patch:
> > >
> > > utime_start=0.286131
> > > utime_end=4.378886
> > > stime_start=11.952556
> > > stime_end=1991.655533
> > > num_transactions=1446830
> > > latency_min=0.001061085
> > > latency_max=12.075275028
> > > latency_mean=0.376375302
> > > latency_stddev=1.361969596
> > > num_samples=306383
> > > throughput=151866.56
> > >
> > > perf top:
> > >
> > > 50.01% [kernel] [k] __inet6_check_established
> > > 20.65% [kernel] [k] __inet_hash_connect
> > > 15.81% [kernel] [k] inet6_ehashfn
> > > 2.92% [kernel] [k] rcu_all_qs
> > > 2.34% [kernel] [k] __cond_resched
> > > 0.50% [kernel] [k] _raw_spin_lock
> > > 0.34% [kernel] [k] sched_balance_trigger
> > > 0.24% [kernel] [k] queued_spin_lock_slowpath
> > >
> > > There is indeed an increase of throughput and reduction of latency.
> > >
> > > Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> >
> > Tested-by: Jason Xing <kerneljasonxing@...il.com>
> > Reviewed-by: Jason Xing <kerneljasonxing@...il.com>
> >
> > Throughput goes from 12829 to 26072. The percentage increase (103%)
> > is alluring to me!
> >
> > Thanks,
> > Jason