linux-kernel - Re: [PATCH v6 00/19] The new cgroup slab memory controller

Message-ID: <20200617033217.GE10812@carbon.lan>
Date:   Tue, 16 Jun 2020 20:32:17 -0700
From:   Roman Gushchin <guro@...com>
To:     Shakeel Butt <shakeelb@...gle.com>
CC:     Andrew Morton <akpm@...ux-foundation.org>,
        Christoph Lameter <cl@...ux.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Michal Hocko <mhocko@...nel.org>,
        Linux MM <linux-mm@...ck.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        Kernel Team <kernel-team@...com>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v6 00/19] The new cgroup slab memory controller

On Tue, Jun 16, 2020 at 08:05:39PM -0700, Shakeel Butt wrote:
> On Tue, Jun 16, 2020 at 7:41 PM Roman Gushchin <guro@...com> wrote:
> >
> > On Tue, Jun 16, 2020 at 06:46:56PM -0700, Shakeel Butt wrote:
> > > On Mon, Jun 8, 2020 at 4:07 PM Roman Gushchin <guro@...com> wrote:
> > > >
> [...]
> > >
> > > Have you performed any [perf] testing on SLAB with this patchset?
> >
> > The accounting part is the same for SLAB and SLUB, so there should be no
> > significant difference. I've checked that it compiles, boots and passes
> > kselftests. And that memory savings are there.
> >
> 
> What about performance? Also you mentioned that sharing kmem-cache
> between accounted and non-accounted can have additional overhead. Any
> difference between SLAB and SLUB for such a case?

Not really.

Sharing a single set of caches adds some overhead to root- and non-accounted
allocations, which is something I tried hard to avoid in my original version.
But I have to admit, it makes it possible to simplify and remove a lot of code,
and here it's hard to argue with Johannes, who pushed for this design.

With performance testing it's not that easy, because it's not obvious what
we want to test. Obviously, per-object accounting is more expensive, and
measuring something like 1000000 allocations and deallocations in a row from
a single kmem_cache will show a regression. But in the real world the relative
cost of allocations is usually low, and we can get some benefits from a smaller
working set and from having shared kmem_cache objects cache hot.
Not to mention the memory savings and the reduced fragmentation.
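
To put rough numbers on the microbenchmark-vs-real-world point, here is a
back-of-the-envelope sketch. It assumes only the slab alloc/free path gets
slower and everything else is unchanged; the function name and both input
values are hypothetical and are not measurements from this patchset.

```python
def end_to_end_slowdown(alloc_fraction, alloc_overhead):
    """Relative end-to-end slowdown when only the allocator path gets slower.

    alloc_fraction: share of total runtime spent in slab alloc/free (0..1).
    alloc_overhead: relative extra cost of the accounted path (0.3 == +30%).
    """
    if not 0.0 <= alloc_fraction <= 1.0:
        raise ValueError("alloc_fraction must be within [0, 1]")
    return alloc_fraction * alloc_overhead


# Hypothetical inputs, chosen only to illustrate the scaling.
microbench = end_to_end_slowdown(1.0, 0.30)  # tight alloc/free loop
workload = end_to_end_slowdown(0.02, 0.30)   # allocator is 2% of runtime
print(f"microbenchmark: {microbench:.1%}, workload: {workload:.1%}")
```

The same per-allocation overhead that dominates a tight loop shrinks in
proportion to the share of time a workload actually spends in the allocator.
The model ignores the second-order effects mentioned above (smaller working
set, cache-hot shared objects), which could move the result either way.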

We've done extensive testing of the original version in Facebook production,
and we haven't noticed any regressions so far. But I have to admit, we were
using the original version with two sets of kmem_caches.

If you have any specific tests in mind, I can definitely run them. Or if you
can help with the performance evaluation, I'd appreciate it a lot.

Thanks!