Date:   Thu, 28 Apr 2022 01:35:54 +0300
From:   Vasily Averin <vvs@...nvz.org>
To:     Shakeel Butt <shakeelb@...gle.com>,
        Michal Koutný <mkoutny@...e.com>
Cc:     Roman Gushchin <roman.gushchin@...ux.dev>,
        Vlastimil Babka <vbabka@...e.cz>, kernel@...nvz.org,
        Florian Westphal <fw@...len.de>,
        LKML <linux-kernel@...r.kernel.org>,
        Michal Hocko <mhocko@...e.com>,
        Cgroups <cgroups@...r.kernel.org>,
        netdev <netdev@...r.kernel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Tejun Heo <tj@...nel.org>,
        Luis Chamberlain <mcgrof@...nel.org>,
        Kees Cook <keescook@...omium.org>,
        Iurii Zaikin <yzaikin@...gle.com>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: [PATCH] memcg: accounting for objects allocated for new netdevice

On 4/27/22 19:52, Shakeel Butt wrote:
> On Wed, Apr 27, 2022 at 7:01 AM Michal Koutný <mkoutny@...e.com> wrote:
>>
>> Hello Vasily.
>>
>> On Wed, Apr 27, 2022 at 01:37:50PM +0300, Vasily Averin <vvs@...nvz.org> wrote:
>>> diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
>>> index cfa79715fc1a..2881aeeaa880 100644
>>> --- a/fs/kernfs/mount.c
>>> +++ b/fs/kernfs/mount.c
>>> @@ -391,7 +391,7 @@ void __init kernfs_init(void)
>>>  {
>>>       kernfs_node_cache = kmem_cache_create("kernfs_node_cache",
>>>                                             sizeof(struct kernfs_node),
>>> -                                           0, SLAB_PANIC, NULL);
>>> +                                           0, SLAB_PANIC | SLAB_ACCOUNT, NULL);
>>
>> kernfs accounting you say?
>> kernfs backs up also cgroups, so the parent-child accounting comes to my
>> mind.
>> See the temporary switch to parent memcg in mem_cgroup_css_alloc().
>>
>> (I mean this makes some sense but I'd suggest unlumping the kernfs into
>> a separate path for possible discussion and its not-only-netdevice
>> effects.)
> 
> I agree with Michal that kernfs accounting should be its own patch.
> Internally at Google, we actually have enabled the memcg accounting of
> kernfs nodes. We have workloads which create 100s of subcontainers and
> without memcg accounting of kernfs we see high system overhead.

I also considered moving kernfs accounting into a separate patch,
but in the end decided to include it in the current one.

Kernfs accounting is critical for the described scenario. Without it,
creating a typical netdevice charges only ~50% of the allocated memory, and
the rest of the patch is not enough to protect the host properly.

Now I'm going to follow your recommendation and split the patch.

Thank you,
	Vasily Averin
