Message-ID: <20240916140130.GB415778@kernel.org>
Date: Mon, 16 Sep 2024 15:01:30 +0100
From: Simon Horman <horms@...nel.org>
To: Przemek Kitszel <przemyslaw.kitszel@...el.com>
Cc: Alexandre Ferrieux <alexandre.ferrieux@...il.com>,
	Alexandre Ferrieux <alexandre.ferrieux@...nge.com>,
	Eric Dumazet <edumazet@...gle.com>, netdev@...r.kernel.org
Subject: Re: RFC: Should net namespaces scale up (>10k) ?

On Mon, Sep 16, 2024 at 12:13:35PM +0200, Przemek Kitszel wrote:
> On 9/15/24 22:49, Alexandre Ferrieux wrote:
> > (thanks Simon, reposting with another account to avoid the offending disclaimer)
> > 
> > Hi,
> > 
> > Currently, netns don't really scale beyond a few thousand, for
> > mundane reasons (see below). But should they ? Is there, in the
> > design, an assumption that tens of thousands of network namespaces are
> > considered "unreasonable" ?
> > 
> > A typical use case for such ridiculous numbers is a tester for
> > firewalls or carrier-grade NATs. In these, you typically want tens of
> > thousands of tunnels, each of which is perfectly instantiated as an
> > interface. And, to avoid an explosion in source routing rules, you
> > want them in separate namespaces.
> > 
> > Now why don't they scale *today* ? For two independent, seemingly
> > accidental, O(N) scans of the netns list.
> > 
> > 1. The "netdevice notifier" from the Wireless Extensions subsystem
> > insists on scanning the whole list regardless of the nature of the
> > change, without checking whether any of these namespaces holds a
> > wireless interface, or even whether the system has _any_ wireless
> > hardware...
> > 
> >          for_each_net(net) {
> >                  while ((skb = skb_dequeue(&net->wext_nlevents)))
> >                          rtnl_notify(skb, net, 0, RTNLGRP_LINK, NULL,
> >                                      GFP_KERNEL);
> >          }
> > 
> > 2. When moving an interface (eg an IPVLAN slave) to another netns,
> > __dev_change_net_namespace() calls peernet2id_alloc() in order to get
> > an ID for the target namespace. This again incurs a full scan of the
> > netns list:
> > 
> >          int id = idr_for_each(&net->netns_ids, net_eq_idr, peer);
> 
> This piece is inside __peernet2id(), which is called in a for_each_net()
> loop, making it O(n^2):
> 
>          for_each_net(tmp) {
>                  int id;
>
>                  spin_lock_bh(&tmp->nsid_lock);
>                  id = __peernet2id(tmp, net);
> 
> > 
> > Note that, while IDR is very fast when going from ID to pointer, the
> > reverse path is awfully slow... But why are IDs needed in the first
> > place, instead of the simple netns pointers ?
> > 
> > Any insight on the (possibly very good) reasons those two apparent
> > warts stand in the way of netns scaling up ?
> > 
> > -Alex
> > 
> 
> I guess that the reason is more pragmatic: net namespaces are a decade
> older than xarray, hence the list-based implementation.

Yes, I would also guess that these limitations were not part of the
design, but rather that the implementation scaled sufficiently at the
time. If further scale is required, the implementation can be updated.
