Message-ID: <46c1c59e-1368-620d-e57a-f35c2c82084d@linux.dev>
Date:   Mon, 11 Apr 2022 12:40:29 +0300
From:   Vasily Averin <vasily.averin@...ux.dev>
To:     Shakeel Butt <shakeelb@...gle.com>
Cc:     "Eric W. Biederman" <ebiederm@...ssion.com>,
        Vlastimil Babka <vbabka@...e.cz>, NeilBrown <neilb@...e.de>,
        Michal Hocko <mhocko@...e.com>,
        Roman Gushchin <roman.gushchin@...ux.dev>,
        Linux MM <linux-mm@...ck.org>, netdev@...r.kernel.org,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>, Tejun Heo <tj@...nel.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Eric Dumazet <edumazet@...gle.com>,
        Kees Cook <keescook@...omium.org>,
        Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
        David Ahern <dsahern@...nel.org>, linux-kernel@...r.kernel.org,
        kernel@...nvz.org, Luis Chamberlain <mcgrof@...nel.org>
Subject: problem with accounting of allocations called from __net_init hooks

On 3/1/22 21:09, Shakeel Butt wrote:
> On Mon, Feb 28, 2022 at 06:36:58AM -0800, Luis Chamberlain wrote:
>> On Mon, Feb 28, 2022 at 10:17:16AM +0300, Vasily Averin wrote:
>> > Following one-liner running inside memcg-limited container consumes
>> > huge number of host memory and can trigger global OOM.
>> >
>> > for i in `seq 1 xxx` ; do ip l a v$i type veth peer name vp$i ; done
>> >
>> > Patch accounts most part of these allocations and can protect host.
>> > ---[cut]---
>> > It is not polished, and perhaps should be splitted.
>> > obviously it affects other kind of netdevices too.
>> > Unfortunately I'm not sure that I will have enough time to handle it properly
>> > and decided to publish current patch version as is.
>> > OpenVz workaround it by using per-container limit for number of
>> > available netdevices, but upstream does not have any kind of
>> > per-container configuration.
>> > ------

I've noticed that __register_pernet_operations() executes the init hook of
the registered pernet_operations structure in every existing net namespace.

Usually these hooks are called by a process that belongs to the net namespace
in question, so all allocations marked for accounting are charged to the
corresponding container: objects related to a netns in container A are
accounted to the memcg of container A, objects allocated inside container B
are accounted to memcg B, and so on.

However, __register_pernet_operations() calls the same hooks from a single
context, and as a result all marked allocations are accounted to one memcg.
This is quite a rare scenario, but the current behaviour looks incorrect to me.

I expect we can take the memcg from 'struct net', because this structure is
itself accounted, and then call set_active_memcg() before executing the init
hook. However, I'm not sure this is fully correct.
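
To make the idea concrete, here is a minimal userspace sketch, not kernel
code. All types and model_* names below are hypothetical stand-ins invented
for illustration; only the general shape (save the old active memcg, switch
to the one that owns the netns, run the hook, restore) reflects the proposal
above. It assumes each netns can be mapped to the memcg that was charged for
it.

```c
/* Userspace model only: none of these types or functions are kernel APIs. */
#include <assert.h>
#include <stddef.h>

struct memcg { const char *name; size_t charged; };  /* hypothetical */
struct net   { struct memcg *owner; };  /* memcg charged for this netns */

static struct memcg *current_memcg;  /* memcg of the calling task */
static struct memcg *active_memcg;   /* override, NULL when unset */

/* Models set_active_memcg(): install an override, return the previous one. */
static struct memcg *model_set_active_memcg(struct memcg *m)
{
	struct memcg *old = active_memcg;

	active_memcg = m;
	return old;
}

/* Models an accounted allocation: charge the override if set, else caller. */
static void model_charge(size_t bytes)
{
	struct memcg *m = active_memcg ? active_memcg : current_memcg;

	assert(m != NULL);
	m->charged += bytes;
}

/* Stand-in for a pernet init hook that makes one accounted allocation. */
static int model_init_hook(struct net *net)
{
	(void)net;
	model_charge(4096);
	return 0;
}

/*
 * Models the loop in __register_pernet_operations(): a single caller runs
 * the init hook for every netns. With use_net_memcg != 0 the charge target
 * is switched to the owner of each netns for the duration of the hook.
 */
static void model_register(struct net *nets, size_t n, int use_net_memcg)
{
	for (size_t i = 0; i < n; i++) {
		struct memcg *old = NULL;

		if (use_net_memcg)
			old = model_set_active_memcg(nets[i].owner);
		model_init_hook(&nets[i]);
		if (use_net_memcg)
			model_set_active_memcg(old);
	}
}
```

In this model, with two namespaces owned by memcgs A and B and a caller in a
third memcg, model_register(nets, 2, 0) charges all 8192 bytes to the caller,
while model_register(nets, 2, 1) charges 4096 bytes each to A and B.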

Could you please advise a better solution?

Thank you,
	Vasily Averin
