Message-ID: <20160311153406.GB6620@breakpoint.cc>
Date: Fri, 11 Mar 2016 16:34:06 +0100
From: Florian Westphal <fw@...len.de>
To: "Yuriy M. Kaminskiy" <yumkam@...il.com>
Cc: netdev@...r.kernel.org, containers@...ts.osdl.org,
linux-kernel@...r.kernel.org
Subject: Re: userns, netns, and quick physical memory consumption by
unprivileged user

Yuriy M. Kaminskiy <yumkam@...il.com> wrote:
> BTW, all those hash/conntrack/etc default sizes were calculated from
> physical memory size on the assumption that there will be only *one*
> instance of those tables. Obviously, the introduction of network
> namespaces (and especially unprivileged user-ns) threw this assumption
> out the window (and here comes that "falling back to vmalloc" message
> again; in the pre-netns world, those tables were allocated *once* on
> early system startup, with typically plenty of free and unfragmented
> memory).

No idea how to fix this except by removing conntrack support in net
namespaces completely.
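
To put rough numbers on the scaling concern quoted above, here is a
minimal arithmetic sketch. The sizing rule (a fixed fraction of RAM,
capped at a bucket limit) and every constant in it are illustrative
assumptions, not values taken from any particular kernel; table_bytes
and total_bytes are hypothetical helper names.

```python
# Illustrative arithmetic only: the sizing rule and the constants below
# are assumptions for this sketch, not read from any kernel source.

def table_bytes(ram_bytes, ram_divisor=16384, bucket_cap=16384, bucket_size=8):
    """Bytes used by one hash table whose bucket count is derived from RAM."""
    buckets = (ram_bytes // ram_divisor) // bucket_size
    return min(buckets, bucket_cap) * bucket_size


def total_bytes(ram_bytes, namespaces):
    """Bytes pinned when every namespace gets its own copy of that table."""
    return table_bytes(ram_bytes) * namespaces


if __name__ == "__main__":
    ram = 4 * 1024**3  # hypothetical 4 GiB machine
    for n in (1, 100, 10_000):
        print(n, "namespaces ->", total_bytes(ram, n) // 1024, "KiB")
```

A size that is harmless for one table grows linearly with the number of
namespaces, and each extra copy is allocated long after boot, when large
contiguous allocations are harder to satisfy.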

I'd disallow all write accesses to skb->nfct (NAT, CONNMARK,
CONNSECMARK, ...) and then no longer clear skb->nfct when forwarding
a packet from init_ns to a container.

Containers could then still test conntrack as seen from the init
namespace's point of view in PREROUTING/FORWARD/INPUT (but not OUTPUT,
obviously).
[ OUTPUT *might* be doable as well by allowing NEW creation in OUTPUT
but skipping NAT and deferring the confirmation/commit of the new
entry to the table until the skb leaves initns ]

We could key conntrack entries to the initns conntrack table
instead of adding one new table per netns, but it seems this only
replaces one problem with a new one (filling/blocking the initns table
from another netns).
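
The trade-off of a single shared table can be illustrated with a toy
model. This is only a sketch under stated assumptions: SharedConntrack
and flood are hypothetical names, the table is a plain dictionary with a
fixed capacity, and nothing here mirrors the kernel's actual data
structures or eviction behaviour.

```python
class SharedConntrack:
    """Toy model of one fixed-capacity table shared by all namespaces."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}  # (netns_id, flow_tuple) -> state

    def add(self, netns_id, flow_tuple):
        """Insert an entry; return False when the shared table is full."""
        key = (netns_id, flow_tuple)
        if key in self.entries:
            return True  # already tracked, nothing to allocate
        if len(self.entries) >= self.capacity:
            return False  # table exhausted for every namespace
        self.entries[key] = "NEW"
        return True


def flood(table, netns_id, count):
    """Let one namespace open `count` distinct flows; return how many fit."""
    return sum(table.add(netns_id, ("10.0.0.1", port)) for port in range(count))


if __name__ == "__main__":
    table = SharedConntrack(capacity=4)
    print("flows accepted from netns 1:", flood(table, 1, 10))
    print("new flow from init netns accepted:", table.add(0, ("192.0.2.1", 80)))
```

Keying by namespace keeps the entries apart, but the capacity is still
shared, so one namespace filling the table denies new entries to all the
others, including the init namespace.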

Maybe we could go with a compromise and skip/disallow conntrack in
unprivileged user namespaces only?