Message-ID: <db6ecdc4-8053-42d6-89cc-39c70b199bde@intel.com>
Date: Mon, 16 Sep 2024 12:13:35 +0200
From: Przemek Kitszel <przemyslaw.kitszel@...el.com>
To: Alexandre Ferrieux <alexandre.ferrieux@...il.com>
CC: Alexandre Ferrieux <alexandre.ferrieux@...nge.com>, <horms@...nel.org>,
Eric Dumazet <edumazet@...gle.com>, <netdev@...r.kernel.org>
Subject: Re: RFC: Should net namespaces scale up (>10k) ?
On 9/15/24 22:49, Alexandre Ferrieux wrote:
> (thanks Simon, reposting with another account to avoid the offending disclaimer)
>
> Hi,
>
> Currently, netns don't really scale beyond a few thousands, for
> mundane reasons (see below). But should they ? Is there, in the
> design, an assumption that tens of thousands of network namespaces are
> considered "unreasonable" ?
>
> A typical use case for such ridiculous numbers is a tester for
> firewalls or carrier-grade NATs. In these, you typically want tens of
> thousands of tunnels, each of which is perfectly instantiated as an
> interface. And, to avoid an explosion in source routing rules, you
> want them in separate namespaces.
>
> Now why don't they scale *today* ? For two independent, seemingly
> accidental, O(N) scans of the netns list.
>
> 1. The "netdevice notifier" from the Wireless Extensions subsystem
> insists on scanning the whole list regardless of the nature of the
> change, nor wondering whether all these namespaces hold any wireless
> interface, nor even whether the system has _any_ wireless hardware...
>
>         for_each_net(net) {
>                 while ((skb = skb_dequeue(&net->wext_nlevents)))
>                         rtnl_notify(skb, net, 0, RTNLGRP_LINK, NULL,
>                                     GFP_KERNEL);
>         }
>
> 2. When moving an interface (eg an IPVLAN slave) to another netns,
> __dev_change_net_namespace() calls peernet2id_alloc() in order to get
> an ID for the target namespace. This again incurs a full scan of the
> netns list:
>
> int id = idr_for_each(&net->netns_ids, net_eq_idr, peer);
This piece is inside __peernet2id(), which is called in a for_each_net()
loop, making it O(n^2):
	for_each_net(tmp) {
		int id;

		spin_lock_bh(&tmp->nsid_lock);
		id = __peernet2id(tmp, net);
>
> Note that, while IDR is very fast when going from ID to pointer, the
> reverse path is awfully slow... But why are IDs needed in the first
> place, instead of the simple netns pointers ?
>
> Any insight on the (possibly very good) reasons those two apparent
> warts stand in the way of netns scaling up ?
>
> -Alex
>
I guess the reason is more pragmatic: net namespaces are a decade
older than xarray, hence the list-based implementation.
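
To make the O(n^2) cost concrete, here is a rough userspace model of the
nested scan, not kernel code. All names (toy_net, toy_peernet2id,
toy_scan_all) are made up for illustration, and the flat array is only
an assumed stand-in for the IDR: the point is just that a pointer-to-ID
lookup has to walk the entries, and doing it once per namespace
multiplies the cost.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net; NOT the kernel type. */
struct toy_net {
	const struct toy_net **ids;	/* id -> peer table, models netns_ids */
	size_t nr_ids;			/* number of entries in the table */
};

/*
 * Reverse lookup (pointer -> id) by linear scan, in the spirit of the
 * idr_for_each(..., net_eq_idr, peer) call quoted above.
 * Returns the id, or -1 if peer is not present.
 * *steps accumulates the number of entries visited.
 */
static int toy_peernet2id(const struct toy_net *net,
			  const struct toy_net *peer,
			  unsigned long *steps)
{
	assert(net && steps);

	for (size_t id = 0; id < net->nr_ids; id++) {
		(*steps)++;
		if (net->ids[id] == peer)
			return (int)id;
	}
	return -1;
}

/*
 * Models the for_each_net() loop: one reverse lookup per namespace.
 * Returns the total number of entries visited across all lookups.
 */
static unsigned long toy_scan_all(const struct toy_net *nets, size_t n,
				  const struct toy_net *peer)
{
	unsigned long steps = 0;

	for (size_t i = 0; i < n; i++)
		(void)toy_peernet2id(&nets[i], peer, &steps);

	return steps;
}
```

With n namespaces that each hold n IDs, and a peer that is found late or
not at all, toy_scan_all() visits on the order of n * n entries. A
pointer-keyed reverse index would be one way to avoid that, but this is
only a sketch of the cost, not a proposed patch.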