Date:   Tue, 4 Sep 2018 23:34:11 +0300
From:   Vladimir Davydov <vdavydov.dev@...il.com>
To:     Roman Gushchin <guro@...com>
Cc:     Michal Hocko <mhocko@...nel.org>, Rik van Riel <riel@...riel.com>,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        kernel-team@...com, Josef Bacik <jbacik@...com>,
        Johannes Weiner <hannes@...xchg.org>,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH] mm: slowly shrink slabs with a relatively small number
 of objects

On Tue, Sep 04, 2018 at 10:52:46AM -0700, Roman Gushchin wrote:
> Reparenting of all pages is definitely an option to consider,

Reparenting pages would be great indeed, but I'm not sure we could do
that, because we'd have to walk the page lists of semi-active kmem
caches and do it consistently while pages may be freed under us. Kmem
caches are so optimized for performance that implementing such a
procedure without impacting any hot paths would be nearly impossible
IMHO. And there are two implementations (SLAB/SLUB), both of which we'd
have to support.

> but it's not free in any case, so if there is no problem,
> why should we? Let's keep it as a last measure. In my case,
> the proposed patch works perfectly: the number of dying cgroups
> jumps around 100, where it grew steadily to 2k and more before.
> 
> I believe that reparenting of LRU lists is required to minimize
> the number of LRU lists to scan, but I'm not sure.

AFAIR the sole purpose of LRU reparenting is releasing the kmemcg_id as
soon as a cgroup directory is deleted. If we didn't do that, dead
cgroups would occupy slots in the per-memcg arrays (list_lru,
kmem_cache), so if we had, say, 10K dead cgroups, we'd have to allocate
an 80 KB array to store per-memcg data for each kmem_cache and
list_lru. Back when kmem accounting was introduced, we used kmalloc()
for allocating those arrays, so growing them up to 80 KB would too
often result in ENOMEM when trying to create a cgroup. Now we fall back
on vmalloc(), so maybe it wouldn't be a problem anymore...

Alternatively, I guess we could "reparent" those dangling LRU objects
not to the parent cgroup's list_lru_memcg, but to a special
list_lru_memcg which wouldn't be assigned to any cgroup and which would
be reclaimed ASAP under both global and memcg pressure.
