Message-ID: <20190919162204.GA20035@castle.dhcp.thefacebook.com>
Date:   Thu, 19 Sep 2019 16:22:09 +0000
From:   Roman Gushchin <guro@...com>
To:     Suleiman Souhlal <suleiman@...gle.com>
CC:     "linux-mm@...ck.org" <linux-mm@...ck.org>,
        Michal Hocko <mhocko@...nel.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Linux Kernel <linux-kernel@...r.kernel.org>,
        Kernel Team <Kernel-team@...com>,
        "Shakeel Butt" <shakeelb@...gle.com>,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        "Waiman Long" <longman@...hat.com>
Subject: Re: [PATCH RFC 00/14] The new slab memory controller

On Thu, Sep 19, 2019 at 10:39:18PM +0900, Suleiman Souhlal wrote:
> On Fri, Sep 6, 2019 at 6:57 AM Roman Gushchin <guro@...com> wrote:
> > The patchset has been tested on a number of different workloads in our
> > production. In all cases, it saved hefty amounts of memory:
> > 1) web frontend, 650-700 MB, ~42% of slab memory
> > 2) database cache, 750-800 MB, ~35% of slab memory
> > 3) dns server, 700 MB, ~36% of slab memory
> 
> Do these workloads cycle through a lot of different memcgs?

Not really, those are just plain services managed by systemd.
They aren't restarted very often, maybe a few times per day at most.

Also, there is nothing fb-specific. You can take any modern
distribution (I've tried Fedora 30), boot it up and look at the
amount of slab memory. The numbers are roughly the same.
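
If it helps, here is a trivial sketch for checking that yourself. It is
not part of the patchset, just the equivalent of grepping the slab
counters out of /proc/meminfo (the per-cgroup breakdown lives in each
cgroup's memory.stat: slab, slab_reclaimable, slab_unreclaimable on
cgroup v2):

	#include <stdio.h>
	#include <string.h>

	int main(void)
	{
		char line[256];
		FILE *f = fopen("/proc/meminfo", "r");

		if (!f) {
			perror("/proc/meminfo");
			return 1;
		}

		/* Print only the slab-related counters. */
		while (fgets(line, sizeof(line), f))
			if (!strncmp(line, "Slab:", 5) ||
			    !strncmp(line, "SReclaimable:", 13) ||
			    !strncmp(line, "SUnreclaim:", 11))
				fputs(line, stdout);

		fclose(f);
		return 0;
	}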

> 
> For workloads that don't, wouldn't this approach potentially use more
> memory? For example, a workload where everything is in one or two
> memcgs, and those memcgs last forever.
>

Yes, that's true: if you have a very small and fixed number of memory cgroups,
the new approach can in theory take ~10% more memory.

I don't think it's such a big problem though: it seems that the majority
of cgroup users have a lot of them, and they are dynamically created and
destroyed by systemd/kubernetes/whatever else.

And if somebody has a very special setup with only 1-2 cgroups, arguably
kernel memory accounting isn't that important for them, so it can simply
be disabled. Am I wrong? Do you have a real-life example?
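
(For reference, on recent mainline kernels disabling it should just be a
matter of the cgroup.memory=nokmem boot parameter, which turns off kernel
memory accounting entirely; a sketch, assuming your bootloader lets you
append to the kernel command line:

	# appended to the kernel command line via the bootloader config
	cgroup.memory=nokmem
)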

Thanks!

Roman
