Date: Fri, 8 Mar 2024 21:53:47 -0700
From: David Ahern <dsahern@...nel.org>
To: Leone Fernando <leone4fernando@...il.com>, davem@...emloft.net,
 edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com, willemb@...gle.com
Cc: netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net-next 0/4] net: route: improve route hinting

On 3/7/24 10:11 AM, Leone Fernando wrote:
> In 2017, Paolo Abeni introduced the hinting mechanism [1] to the routing
> sub-system. The hinting optimization improves performance by reusing
> previously found dsts instead of looking them up for each skb.
> 
> This patch series introduces a generalized version of the hinting mechanism that
> can "remember" a larger number of dsts. This reduces the number of dst
> lookups for frequently encountered daddrs.
> 
> Before diving into the code and the benchmarking results, it's important
> to address the deletion of the old route cache [2] and why
> this solution is different. The original cache was complicated,
> vulnerable to DoS attacks and had unstable performance.
> 
> The new input dst_cache is much simpler thanks to its lazy approach,
> improving performance without the overhead of the removed cache
> implementation. Instead of using timers and GC, the deletion of invalid
> entries is performed lazily during their lookups.
> The dsts are stored in a simple, lightweight, static hash table. This
> keeps the lookup times fast yet stable, preventing DoS upon cache misses.
> The new input dst_cache implementation is built over the existing
> dst_cache code, which provides fast, lockless per-CPU behavior.
> 
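For readers following the description above, here is a rough user-space sketch of a fixed-size, bucketed cache with lazy invalidation. It is not the code from the series: every name in it (toy_cache, toy_entry, the generation counter, the hash constant, the round-robin victim choice) is a hypothetical stand-in, it is not per-CPU, and it stores an integer where the kernel would hold a dst reference. Only the sizes (512 entries in buckets of 4) are taken from the cover letter.

```c
#include <stdint.h>
#include <string.h>

#define CACHE_ENTRIES 512u                          /* total entries, as in the benchmark */
#define BUCKET_SIZE   4u                            /* entries per bucket, as in the benchmark */
#define NUM_BUCKETS   (CACHE_ENTRIES / BUCKET_SIZE) /* 128 buckets */

/* Hypothetical entry: dst_id stands in for a cached dst pointer. */
struct toy_entry {
    uint32_t daddr;
    uint32_t gen;      /* generation at insert time; 0 means "empty" */
    int      dst_id;
};

/* Hypothetical cache: a static table, no timers, no GC. */
struct toy_cache {
    struct toy_entry slots[NUM_BUCKETS][BUCKET_SIZE];
    uint32_t victim[NUM_BUCKETS];  /* round-robin eviction cursor per bucket */
    uint32_t gen;                  /* bumped whenever cached routes may be stale */
};

static void toy_cache_init(struct toy_cache *c)
{
    memset(c, 0, sizeof(*c));
    c->gen = 1;                    /* 0 is reserved for empty slots */
}

/* Multiplicative hash; the top 7 bits select one of 128 buckets. */
static uint32_t toy_bucket(uint32_t daddr)
{
    return (daddr * 0x9E3779B1u) >> 25;
}

/* Invalidate everything in O(1); stale entries are dropped on later lookups. */
static void toy_cache_invalidate(struct toy_cache *c)
{
    if (++c->gen == 0)
        c->gen = 1;
}

/* Returns 1 on hit. A miss costs at most BUCKET_SIZE comparisons. */
static int toy_cache_lookup(struct toy_cache *c, uint32_t daddr, int *dst_id)
{
    struct toy_entry *b = c->slots[toy_bucket(daddr)];

    for (unsigned i = 0; i < BUCKET_SIZE; i++) {
        if (b[i].gen == 0)
            continue;              /* empty slot */
        if (b[i].gen != c->gen) {
            b[i].gen = 0;          /* lazy deletion of a stale entry */
            continue;
        }
        if (b[i].daddr == daddr) {
            *dst_id = b[i].dst_id;
            return 1;
        }
    }
    return 0;
}

/* Reuse an empty or stale slot if there is one, else evict round-robin. */
static void toy_cache_insert(struct toy_cache *c, uint32_t daddr, int dst_id)
{
    uint32_t h = toy_bucket(daddr);
    struct toy_entry *b = c->slots[h];
    unsigned slot = BUCKET_SIZE;

    for (unsigned i = 0; i < BUCKET_SIZE; i++) {
        if (b[i].gen != c->gen) {
            slot = i;
            break;
        }
    }
    if (slot == BUCKET_SIZE) {
        slot = c->victim[h];
        c->victim[h] = (slot + 1) % BUCKET_SIZE;
    }
    b[slot] = (struct toy_entry){ .daddr = daddr, .gen = c->gen, .dst_id = dst_id };
}
```

The point of the sketch is only that a miss is bounded by the bucket size and that invalidation does no work up front, which is how I read the "lazy" and "stable lookup time" claims; the actual series may differ in every detail.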
> I tested this patch using UDP floods with different numbers of daddrs.
> The benchmarking setup consists of 3 machines: a sender,
> a forwarder and a receiver. I measured the PPS received by the receiver
> as the forwarder was running either the mainline kernel or the patched
> kernel, comparing the results. The dst_cache I tested in this benchmark
> used a total of 512 hash table entries, split into buckets of 4
> entries each.
> 
> These are the results:
>   UDP             mainline              patched                   delta
> conns pcpu         Kpps                  Kpps                       %
>    1              274.0255              269.2205                  -1.75
>    2              257.3748              268.0947                   4.17
>   15              241.3513              258.8016                   7.23
>  100              238.3419              258.4939                   8.46
>  500              238.5390              252.6425                   5.91
> 1000              238.7570              242.1820                   1.43
> 2000              238.7780              236.2640                  -1.05
> 4000              239.0440              233.5320                  -2.31
> 8000              239.3248              232.5680                  -2.82
> 
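The cover letter does not define the delta column, but the numbers are consistent with a relative change against mainline. Assuming that reading, the 100-connection row works out as:

```latex
\Delta = \frac{\text{patched} - \text{mainline}}{\text{mainline}} \times 100
       = \frac{258.4939 - 238.3419}{238.3419} \times 100
       \approx 8.46\,\%
```

In absolute terms that row is a gain of about 20 Kpps, and the 8000-connection row is a loss of about 6.8 Kpps.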

I have looked at all of the sets sent. I cannot convince myself this is
a good idea, but at the same time I do not have constructive feedback on
why it is not acceptable. The gains are modest at best.

