Message-ID: <20200203205834.GA6781@xps.dhcp.thefacebook.com>
Date:   Mon, 3 Feb 2020 12:58:34 -0800
From:   Roman Gushchin <guro@...com>
To:     Johannes Weiner <hannes@...xchg.org>
CC:     <linux-mm@...ck.org>, Andrew Morton <akpm@...ux-foundation.org>,
        Michal Hocko <mhocko@...nel.org>,
        Shakeel Butt <shakeelb@...gle.com>,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        <linux-kernel@...r.kernel.org>, <kernel-team@...com>,
        Bharata B Rao <bharata@...ux.ibm.com>,
        Yafang Shao <laoar.shao@...il.com>
Subject: Re: [PATCH v2 21/28] mm: memcg/slab: use a single set of kmem_caches
 for all memory cgroups

On Mon, Feb 03, 2020 at 02:50:48PM -0500, Johannes Weiner wrote:
> On Mon, Jan 27, 2020 at 09:34:46AM -0800, Roman Gushchin wrote:
> > This is a fairly big but mostly red patch, which makes all non-root
> > slab allocations use a single set of kmem_caches instead of
> > creating a separate set for each memory cgroup.
> > 
> > Because the number of non-root kmem_caches is now capped by the number
> > of root kmem_caches, there is no need to shrink or destroy them
> > prematurely. They can simply be destroyed together with their
> > root counterparts. This allows us to dramatically simplify the
> > management of non-root kmem_caches and delete a ton of code.
> 
> This is definitely going in the right direction. But it doesn't quite
> explain why we still need two sets of kmem_caches?
> 
> In the old scheme, we had completely separate per-cgroup caches with
> separate slab pages. If a cgrouped process wanted to allocate a slab
> object, we'd go to the root cache and use the cgroup id to look up
> the right cgroup cache. On slab free we'd use page->slab_cache.
> 
> Now we have slab pages that have a page->objcg array. Why can't all
> allocations go through a single set of kmem caches? If an allocation
> is coming from a cgroup and the slab page the allocator wants to use
> doesn't have an objcg array yet, we can allocate it on the fly, no?

Well, arguably it can be done, but there are a few drawbacks:

1) On the release path you'll need to do some extra work even for
   root allocations: calculate the offset only to find a NULL objcg pointer.

2) There will be a memory overhead for root allocations
   (which might or might not be compensated by the increase
   of the slab utilization).

3) I'm working on percpu memory accounting that follows a similar scheme,
   except that the obj_cgroups vector is created for the whole percpu block.
   There will be root- and memcg-blocks, and it will be expensive to merge them.
   I kinda like using the same scheme here and there.
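
To make drawbacks 1) and 2) a bit more concrete, here is a rough
user-space sketch in C of what a single set of caches with an on-demand
per-page vector might look like. It is not kernel code and not what the
patchset does: struct slab_page, obj_index(), slab_charge() and
slab_uncharge() are names made up for illustration, and struct obj_cgroup
is reduced to a plain byte counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct obj_cgroup. */
struct obj_cgroup {
	long charged_bytes;
};

/* Hypothetical user-space model of one slab page. */
struct slab_page {
	char *base;                 /* address of the first object */
	size_t obj_size;            /* size of each object in bytes */
	size_t nr_objects;          /* number of objects on the page */
	struct obj_cgroup **objcgs; /* NULL until the first accounted allocation */
};

/* Offset calculation: which slot of the vector belongs to this object. */
static size_t obj_index(const struct slab_page *slab, const void *obj)
{
	size_t idx = (size_t)((const char *)obj - slab->base) / slab->obj_size;

	assert(idx < slab->nr_objects);
	return idx;
}

/* Allocation path: create the vector on the fly for accounted objects. */
static int slab_charge(struct slab_page *slab, void *obj,
		       struct obj_cgroup *objcg)
{
	if (!objcg)
		return 0; /* root allocation: nothing to record */

	if (!slab->objcgs) {
		/* Drawback 2: one pointer per object, even for root objects. */
		slab->objcgs = calloc(slab->nr_objects, sizeof(*slab->objcgs));
		if (!slab->objcgs)
			return -1;
	}

	slab->objcgs[obj_index(slab, obj)] = objcg;
	objcg->charged_bytes += (long)slab->obj_size;
	return 0;
}

/* Release path: returns the objcg that was uncharged, or NULL. */
static struct obj_cgroup *slab_uncharge(struct slab_page *slab, void *obj)
{
	struct obj_cgroup *objcg;
	size_t idx;

	if (!slab->objcgs)
		return NULL; /* page never held an accounted object */

	/* Drawback 1: this lookup is paid for root objects too. */
	idx = obj_index(slab, obj);
	objcg = slab->objcgs[idx];
	if (!objcg)
		return NULL; /* root object on a shared page */

	slab->objcgs[idx] = NULL;
	objcg->charged_bytes -= (long)slab->obj_size;
	return objcg;
}
```

In this model, once a single accounted object lands on a page, every
root object freed from that page goes through obj_index() only to find
a NULL slot, and the page carries nr_objects pointers of overhead no
matter how few of its objects are accounted.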

Upsides?

1) Slab utilization might increase a little bit (but I doubt it will have
   a huge effect, because both sets being merged should be relatively big
   and well utilized).
2) Eliminate memcg kmem_cache dynamic creation/destruction. It's nice,
   but there isn't that much code left anyway.


So IMO it's an interesting direction to explore, but not something
that necessarily has to be done in the context of this patchset.
