Message-ID: <c739f928-86a2-46f8-b92e-86366758bb82@orange.com>
Date: Tue, 24 Sep 2024 16:06:36 +0200
From: Alexandre Ferrieux <alexandre.ferrieux@...il.com>
To: Eric Dumazet <edumazet@...gle.com>,
 Alexandre Ferrieux <alexandre.ferrieux@...il.com>
Cc: Simon Horman <horms@...nel.org>,
 Przemek Kitszel <przemyslaw.kitszel@...el.com>, netdev@...r.kernel.org
Subject: Massive hash collisions on FIB

On 17/09/2024 08:59, Eric Dumazet wrote:
> 
>> What do you think ?
> 
> I do not see any blocker for making things more scalable.
> 
> It is only a matter of time and interest. I think that 99.99 % of
> linux hosts around the world
> have less than 10 netns.
> 
> RTNL removal is a little bit harder (and we hit RTNL contention even
> with less than 10 netns around)

Given this encouragement, I'm proceeding towards the "million-tunnel baby".
And knowing where the current road bumps are, workarounds are possible: instead
of a direct 1M fanout of (netns+interface), I'm doing 10k netns with 100
interfaces each, which works like a charm.

But doing this, I hit an entirely new kind of bottleneck: the single FIB
hashtable, shared by all netns, lends itself to massive collisions if many netns
contain the same local address.

Indeed, in this situation, the fib_inetaddr_notifier ends up inserting a local
route for the address, and the only "moving part" in the hash input is the
address itself.

As an example, after creating 7000 veth pairs and moving their "right half" to
7000 namespaces, an "ip addr add 192.168.1.2/32 dev $D" on one of them hits a
bucket of depth 7000.
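
To make the collision concrete, here is a minimal user-space sketch (not kernel
code). It assumes the bucket index is derived from the address alone;
toy_hash32(), max_depth_addr_only() and the table size are hypothetical
stand-ins, not the real FIB hash function or its sizing.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy multiplicative hash keeping the top 'bits' bits; NOT the kernel's. */
static uint32_t toy_hash32(uint32_t val, unsigned int bits)
{
	assert(bits >= 1 && bits <= 24);
	return (val * 0x9E3779B1u) >> (32 - bits);
}

/*
 * Insert one local-route entry per netns, all for the same address,
 * into a table of (1 << bits) buckets. The netns index is deliberately
 * ignored: the address is the only hash input (assumption of this model).
 * Returns the depth of the deepest bucket.
 */
static unsigned int max_depth_addr_only(unsigned int n_netns, uint32_t addr,
					unsigned int bits)
{
	unsigned int *depth = calloc((size_t)1 << bits, sizeof(*depth));
	unsigned int i, max = 0;

	assert(depth != NULL);
	for (i = 0; i < n_netns; i++) {
		uint32_t bucket = toy_hash32(addr, bits);

		if (++depth[bucket] > max)
			max = depth[bucket];
	}
	free(depth);
	return max;
}
```

Calling max_depth_addr_only(7000, 0xC0A80102, 10), i.e. 192.168.1.2 in 7000
namespaces, returns 7000: every insertion lands in the same bucket, whatever the
table size.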

To solve this, I'd naively inject a few bits of entropy from the netns itself
(inode number, middle bits of the (struct net *) address, etc.) by XORing them
into the hash value. Unless I'm mistaken, the netns is always unambiguous when a
FIB decision is made, be it for a packet or for some interface configuration
task.
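
A minimal user-space sketch of the idea, under stated assumptions: each netns
provides some 32-bit value (here simply derived from a sequential index,
standing in for an inode number or pointer bits), and the hash is a toy
multiplicative one rather than the kernel's. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy multiplicative hash keeping the top 'bits' bits; NOT the kernel's. */
static uint32_t mix32(uint32_t val, unsigned int bits)
{
	assert(bits >= 1 && bits <= 24);
	return (val * 0x9E3779B1u) >> (32 - bits);
}

/*
 * Hypothetical per-netns entropy. A real implementation would take it
 * from the netns itself (inode number, pointer bits, ...); here it is
 * just a hash of a sequential index.
 */
static uint32_t netns_entropy(unsigned int netns_idx, unsigned int bits)
{
	return mix32(netns_idx + 1, bits);
}

/*
 * Same experiment as a shared-address insertion, but the bucket index
 * is the address hash XORed with the per-netns value.
 * Returns the depth of the deepest bucket.
 */
static unsigned int max_depth_netns_mixed(unsigned int n_netns, uint32_t addr,
					  unsigned int bits)
{
	unsigned int *depth = calloc((size_t)1 << bits, sizeof(*depth));
	unsigned int i, max = 0;

	assert(depth != NULL);
	for (i = 0; i < n_netns; i++) {
		uint32_t bucket = mix32(addr, bits) ^ netns_entropy(i, bits);

		if (++depth[bucket] > max)
			max = depth[bucket];
	}
	free(depth);
	return max;
}
```

With 7000 namespaces sharing one address, the entries now spread over the table
instead of chaining in a single bucket. The lookup side has to apply the same
XOR, which is only possible because the netns is known at that point.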

Would that be acceptable?


