Date: Fri, 26 Sep 2014 16:37:26 -0600
From: David Ahern <dsahern@...il.com>
To: "Eric W. Biederman" <ebiederm@...ssion.com>
CC: nicolas.dichtel@...nd.com, "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: VRFs and the scalability of namespaces

Hi Eric:

As you suggested [1], I am starting a new thread to discuss the scalability
problems of using namespaces for VRFs.

Background
----------

Consider a single system that wants to provide VRF-based features with
support for N VRFs. N could easily be 2048 (e.g., 6WIND [2]), 4000 (e.g.,
Cisco [3]), or even higher. That system runs M services (e.g., quagga, cdp,
lldp, stp, strongswan, some homegrown routing protocol) and includes
standard system services like sshd, as well as monitoring programs like
snmpd and tcollector. In short, M is easily 20 processes that need to have
a presence across all VRFs.

Network Namespaces for VRFs
---------------------------

For the past 4 years or so the response to VRF questions has been a drum
beat of "use network namespaces". But namespaces are not a good match for
VRFs:

1. Network namespaces are a complete separation of the networking stack,
   from network devices up, whereas VRFs are an L3 concept. Using
   namespaces forces an L3 separation model onto L2 apps -- lldp, cdp, etc.
   There are use cases where you want device-level separation, use cases
   where you want only L3-and-up separation, and cases where you want both
   (e.g., divvy up the netdevices in a system across some small number of
   namespaces and then provide VRF-based features within each namespace).

2. Scalability of apps providing service as namespaces are created. How do
   you create a presence for each service in a new network namespace?

   a. Spawn a new process per namespace? A brute-force approach and
      extremely resource intensive -- e.g., the quagga example [4].

   b. Spawn a thread per namespace? Better than a full process, but still
      a heavyweight solution.

   c. Create a socket per namespace? Better, but still resource intensive
      -- N listen sockets per service -- and each service needs to be
      modified for namespace support. For open-source software that means
      each project has to agree that namespace awareness is relevant and
      agree to take the patches.

3. Just creating a network namespace consumes a non-negligible amount of
   memory -- ~200 kB on a 3.10 kernel; I believe the /proc entries are the
   bulk of that. 200 kB per namespace is again a lot of wasted memory and
   overhead.

4. For a single process to straddle multiple namespaces it has to run with
   full root privileges -- CAP_SYS_ADMIN -- to use setns(). Using network
   sockets does not require a process to run as root at all, unless it
   wants privileged ports, in which case CAP_NET_BIND_SERVICE is
   sufficient rather than full root.

The Linux kernel needs proper VRF support -- as an L3 concept. A capability
to run a process in a "VRF any" context provides a resource-efficient
solution: a single process with a single listen socket works across all
VRFs in a namespace, and connected sockets then have a specific VRF
context.

Before droning on even more, does the above provide better context on the
general problem?

Thanks,
David

[1] https://lkml.org/lkml/2014/9/26/840
[2] http://www.6wind.com/6windgate-performance/ip-forwarding
[3] http://www.cisco.com/c/en/us/td/docs/switches/datacenter/sw/verified_scalability/b_Cisco_Nexus_7000_Series_NX-OS_Verified_Scalability_Guide.html
[4] https://lists.quagga.net/pipermail/quagga-users/2010-February/011351.html
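The scalability figures quoted in the mail can be turned into a quick back-of-envelope estimate. The sketch below simply multiplies out the numbers David cites (4000 VRFs, 20 services, ~200 kB per namespace on a 3.10 kernel); these are the mail's own example figures, not new measurements:

```python
# Back-of-envelope cost of the namespace-per-VRF model, using the
# figures from the mail above (illustrative assumptions, not benchmarks).
N_VRFS = 4000            # VRFs on one system (the Cisco-scale example [3])
M_SERVICES = 20          # daemons needing a presence in every VRF
NS_OVERHEAD_KB = 200     # approx. memory per net namespace, 3.10 kernel

# Option (c): one listen socket per service per namespace.
listen_sockets = N_VRFS * M_SERVICES

# Memory consumed just by creating the namespaces themselves.
ns_memory_mib = N_VRFS * NS_OVERHEAD_KB / 1024

print(f"listen sockets: {listen_sockets}")            # -> 80000
print(f"namespace memory: ~{ns_memory_mib:.0f} MiB")  # -> ~781 MiB
```

Roughly 80,000 listen sockets and ~780 MiB of namespace overhead before a single packet is routed, which is the core of the resource argument against namespace-per-VRF.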