netdev - RFC: Should net namespaces scale up (>10k) ?

Message-ID: <c35a227c-6a3d-47c8-95f0-6fd6d41454c5@orange.com>
Date: Sun, 15 Sep 2024 00:34:05 +0200
From: alexandre.ferrieux@...nge.com
To: netdev@...r.kernel.org
Cc: Eric Dumazet <edumazet@...gle.com>
Subject: RFC: Should net namespaces scale up (>10k) ?

Hi,

Currently, netns don't really scale beyond a few thousand, for mundane reasons
(see below). But should they? Is there, in the design, an assumption that tens
of thousands of network namespaces are "unreasonable"?

A typical use case for such ridiculous numbers is a tester for firewalls or
carrier-grade NATs. In these, you typically want tens of thousands of tunnels,
each of which is naturally instantiated as an interface. And, to avoid an
explosion in source routing rules, you want them in separate namespaces.

Now why don't they scale *today*? Because of two independent, seemingly
accidental, O(N) scans of the netns list.

1. The "netdevice notifier" from the Wireless Extensions subsystem insists on
scanning the whole list regardless of the nature of the change, without checking
whether any of these namespaces holds a wireless interface, or even whether the
system has _any_ wireless hardware...

         for_each_net(net) {
                 while ((skb = skb_dequeue(&net->wext_nlevents)))
                         rtnl_notify(skb, net, 0, RTNLGRP_LINK, NULL,
                                     GFP_KERNEL);
         }
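
To make the cost concrete, here is a minimal userspace sketch of that loop's
shape. It is not kernel code: struct model_net and drain_all() are hypothetical
names, and a plain counter stands in for the per-namespace wext_nlevents queue.
It only shows that the number of namespaces visited is N even when nothing is
queued anywhere.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a network namespace: only the
 * number of queued wireless events is modelled. */
struct model_net {
    int pending_events;
};

/* Same shape as the for_each_net() loop above: every namespace is
 * visited, even when its queue is empty. Returns the number of
 * namespaces visited; *delivered receives the number of events drained. */
size_t drain_all(struct model_net *nets, size_t n, size_t *delivered)
{
    size_t visited = 0;

    assert(nets != NULL && delivered != NULL);
    *delivered = 0;
    for (size_t i = 0; i < n; i++) {
        visited++;
        while (nets[i].pending_events > 0) {
            nets[i].pending_events--;
            (*delivered)++;
        }
    }
    return visited;
}
```

With 10,000 modelled namespaces and a single queued event, drain_all() still
reports 10,000 visits for one delivery.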

2. When moving an interface (e.g. an IPVLAN slave) to another netns,
__dev_change_net_namespace() calls peernet2id_alloc() in order to get an ID for
the target namespace. This again incurs a full scan of the netns list:

         int id = idr_for_each(&net->netns_ids, net_eq_idr, peer);

Note that, while IDR is very fast when going from ID to pointer, the reverse
path (pointer to ID) walks the entries one by one and is awfully slow... But why
are IDs needed in the first place, instead of the simple netns pointers?
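
The asymmetry can be illustrated with a small userspace model. This is only a
sketch under stated assumptions: struct model_ids and both helpers are
hypothetical, and a flat array with dense IDs stands in for the kernel IDR
(which is a tree, not an array). The point is just that ID-to-pointer is one
indexed access, while pointer-to-ID has to compare entry after entry, as
idr_for_each() does with an equality callback.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_IDS 16384

/* Hypothetical stand-in for net->netns_ids: IDs 0..count-1 are in use
 * and peer[id] is the object registered under that ID. */
struct model_ids {
    const void *peer[MODEL_MAX_IDS];
    size_t count;
};

/* ID -> pointer: a single indexed load. */
const void *model_id_to_peer(const struct model_ids *t, size_t id)
{
    assert(t != NULL);
    return id < t->count ? t->peer[id] : NULL;
}

/* pointer -> ID: compare every registered entry until one matches.
 * *steps receives the number of entries examined. Returns -1 if the
 * peer has no ID in this table. */
long model_peer_to_id(const struct model_ids *t, const void *peer,
                      size_t *steps)
{
    assert(t != NULL && peer != NULL && steps != NULL);
    *steps = 0;
    for (size_t id = 0; id < t->count; id++) {
        (*steps)++;
        if (t->peer[id] == peer)
            return (long)id;
    }
    return -1;
}
```

In this model, looking up the ID of the last of 10,000 registered peers
examines all 10,000 entries, and so does a lookup for a peer that has no ID
yet.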

Any insight into the (possibly very good) reasons why those two apparent warts
stand in the way of netns scaling up?

-Alex