Message-ID: <CANn89iKLgA_BhiXik2_Xq4HMmA4vnU3JHC8CEsaH6dvD9QK_ng@mail.gmail.com>
Date: Thu, 6 Mar 2025 09:14:41 +0100
From: Eric Dumazet <edumazet@...gle.com>
To: Jason Xing <kerneljasonxing@...il.com>
Cc: "David S . Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Neal Cardwell <ncardwell@...gle.com>,
Kuniyuki Iwashima <kuniyu@...zon.com>, Jason Xing <kernelxing@...cent.com>,
Simon Horman <horms@...nel.org>, netdev@...r.kernel.org, eric.dumazet@...il.com
Subject: Re: [PATCH net-next 1/2] inet: change lport contribution to
inet_ehashfn() and inet6_ehashfn()
On Thu, Mar 6, 2025 at 8:54 AM Jason Xing <kerneljasonxing@...il.com> wrote:
>
> On Wed, Mar 5, 2025 at 11:46 AM Eric Dumazet <edumazet@...gle.com> wrote:
> >
> > In order to speed up __inet_hash_connect(), we want to ensure that hash
> > values for <source address, port X, destination address, destination port>
> > are not randomly spread but monotonically increasing in X.
> >
> > The goal is to allow __inet_hash_connect() to derive the hash value
> > of a candidate 4-tuple with a single addition in the following
> > patch in the series.
> >
> > Given:
> > hash_0 = inet_ehashfn(saddr, 0, daddr, dport)
> > hash_sport = inet_ehashfn(saddr, sport, daddr, dport)
> >
> > Then (hash_sport == hash_0 + sport) for all sport values.
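> >
> > A minimal userspace sketch of the construction (a toy mix function
> > stands in for the kernel's jhash; all names here are illustrative,
> > not the actual implementation):
> >
> > #include <stdint.h>
> > #include <stdio.h>
> >
> > /* Toy stand-in for jhash over the port-less part of the 4-tuple. */
> > static uint32_t toy_ehashfn_base(uint32_t saddr, uint32_t daddr,
> > 				 uint16_t dport)
> > {
> > 	uint32_t h = saddr * 2654435761u;
> >
> > 	h ^= daddr + 0x9e3779b9u + (h << 6) + (h >> 2);
> > 	h ^= dport + 0x9e3779b9u + (h << 6) + (h >> 2);
> > 	return h;
> > }
> >
> > /* sport is added after the mix, so the hash is monotonic in sport. */
> > static uint32_t toy_ehashfn(uint32_t saddr, uint16_t sport,
> > 			    uint32_t daddr, uint16_t dport)
> > {
> > 	return toy_ehashfn_base(saddr, daddr, dport) + sport;
> > }
> >
> > int main(void)
> > {
> > 	uint32_t hash_0 = toy_ehashfn(1, 0, 2, 80);
> > 	unsigned int sport;
> >
> > 	/* Each hash is exactly hash_0 + sport. */
> > 	for (sport = 32768; sport < 32772; sport++)
> > 		printf("sport=%u delta=%u\n", sport,
> > 		       toy_ehashfn(1, sport, 2, 80) - hash_0);
> > 	return 0;
> > }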
> >
> > As far as I know, there is no security implication with this change.
>
> Good to know. The moment I read the first paragraph, I was wondering
> whether it might bring a potential risk.
>
> Sorry, I hesitated to bring up one question: could this new algorithm
> result in sockets concentrating in a few buckets instead of being
> sufficiently dispersed, as before?
As I said, I see no difference for servers, since their sport is a fixed
value: what matters for them is the hash contribution of the remote address
and port, because the server port is usually well known.
For a fixed sport, the new scheme only shifts every hash by the same
constant, so the hash distribution is unchanged and an attacker will not be
able to target a particular bucket.
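To make that concrete, here is a quick userspace check (a toy mixer and
made-up names, purely to illustrate the argument): shifting every hash by
the same constant only rotates the buckets, leaving the shape of the
distribution unchanged.

#include <stdint.h>
#include <stdio.h>

#define NBUCKETS 8

int main(void)
{
	unsigned int base_cnt[NBUCKETS] = { 0 }, shift_cnt[NBUCKETS] = { 0 };
	const uint32_t sport = 443;	/* fixed, well-known server port */
	uint32_t daddr;
	unsigned int i;

	/* Hash many remote peers with a toy mixer. */
	for (daddr = 0; daddr < 100000; daddr++) {
		uint32_t h = daddr * 2654435761u ^ (daddr >> 13);

		base_cnt[h % NBUCKETS]++;		/* port-less hash */
		shift_cnt[(h + sport) % NBUCKETS]++;	/* hash_0 + sport */
	}

	/* shift_cnt is base_cnt rotated by sport % NBUCKETS: same shape. */
	for (i = 0; i < NBUCKETS; i++)
		printf("bucket %u: base=%u shifted=%u\n",
		       i, base_cnt[i], shift_cnt[i]);
	return 0;
}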
> Well, the good news is that I tested other cases like TCP_CRR and saw
> no performance degradation. But those tests didn't cover the case of
> one client establishing connections to many different servers.
>
> >
> > After this patch, when __inet_hash_connect() has to try XXXX candidates,
> > the hash table buckets are contiguous and packed, allowing better use
> > of cpu caches and hardware prefetchers, as the sketch below illustrates.
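> >
> > Building on the toy_ehashfn_base() sketch earlier in this thread, the
> > candidate walk enabled by the follow-up patch can be sketched roughly
> > as follows (toy_bucket_is_free() is an assumed helper, not a real
> > kernel function):
> >
> > /* Return the first port in [low, high) whose 4-tuple is unused. */
> > static int toy_hash_connect(uint32_t saddr, uint32_t daddr,
> > 			    uint16_t dport, uint32_t low, uint32_t high)
> > {
> > 	uint32_t hash_0 = toy_ehashfn_base(saddr, daddr, dport);
> > 	uint32_t port;
> >
> > 	for (port = low; port < high; port++) {
> > 		/* One addition per candidate, no full rehash. */
> > 		uint32_t hash = hash_0 + port;
> >
> > 		/*
> > 		 * Consecutive ports now map to consecutive buckets, so
> > 		 * the walk is sequential in memory: good for cpu caches
> > 		 * and hardware prefetchers.
> > 		 */
> > 		if (toy_bucket_is_free(hash, saddr, port, daddr, dport))
> > 			return port;
> > 	}
> > 	return -1;	/* no free port found */
> > }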
> >
> > Tested:
> >
> > Server: ulimit -n 40000; neper/tcp_crr -T 200 -F 30000 -6 --nolog
> > Client: ulimit -n 40000; neper/tcp_crr -T 200 -F 30000 -6 --nolog -c -H server
> >
> > Before this patch:
> >
> > utime_start=0.271607
> > utime_end=3.847111
> > stime_start=18.407684
> > stime_end=1997.485557
> > num_transactions=1350742
> > latency_min=0.014131929
> > latency_max=17.895073144
> > latency_mean=0.505675853
> > latency_stddev=2.125164772
> > num_samples=307884
> > throughput=139866.80
> >
> > perf top on client:
> >
> > 56.86% [kernel] [k] __inet6_check_established
> > 17.96% [kernel] [k] __inet_hash_connect
> > 13.88% [kernel] [k] inet6_ehashfn
> > 2.52% [kernel] [k] rcu_all_qs
> > 2.01% [kernel] [k] __cond_resched
> > 0.41% [kernel] [k] _raw_spin_lock
> >
> > After this patch:
> >
> > utime_start=0.286131
> > utime_end=4.378886
> > stime_start=11.952556
> > stime_end=1991.655533
> > num_transactions=1446830
> > latency_min=0.001061085
> > latency_max=12.075275028
> > latency_mean=0.376375302
> > latency_stddev=1.361969596
> > num_samples=306383
> > throughput=151866.56
> >
> > perf top:
> >
> > 50.01% [kernel] [k] __inet6_check_established
> > 20.65% [kernel] [k] __inet_hash_connect
> > 15.81% [kernel] [k] inet6_ehashfn
> > 2.92% [kernel] [k] rcu_all_qs
> > 2.34% [kernel] [k] __cond_resched
> > 0.50% [kernel] [k] _raw_spin_lock
> > 0.34% [kernel] [k] sched_balance_trigger
> > 0.24% [kernel] [k] queued_spin_lock_slowpath
> >
> > There is indeed an increase in throughput (about 8.6%, from 139866.80
> > to 151866.56) and a reduction in latency (mean drops from ~0.51 to ~0.38).
> >
> > Signed-off-by: Eric Dumazet <edumazet@...gle.com>
>
> Tested-by: Jason Xing <kerneljasonxing@...il.com>
> Reviewed-by: Jason Xing <kerneljasonxing@...il.com>
>
> Throughput goes from 12829 to 26072. The percentage increase - 103% -
> is impressive!
>
> Thanks,
> Jason