Date:   Tue, 16 Jun 2020 20:32:17 -0700
From:   Roman Gushchin <guro@...com>
To:     Shakeel Butt <shakeelb@...gle.com>
CC:     Andrew Morton <akpm@...ux-foundation.org>,
        Christoph Lameter <cl@...ux.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Michal Hocko <mhocko@...nel.org>,
        Linux MM <linux-mm@...ck.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        Kernel Team <kernel-team@...com>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v6 00/19] The new cgroup slab memory controller

On Tue, Jun 16, 2020 at 08:05:39PM -0700, Shakeel Butt wrote:
> On Tue, Jun 16, 2020 at 7:41 PM Roman Gushchin <guro@...com> wrote:
> >
> > On Tue, Jun 16, 2020 at 06:46:56PM -0700, Shakeel Butt wrote:
> > > On Mon, Jun 8, 2020 at 4:07 PM Roman Gushchin <guro@...com> wrote:
> > > >
> [...]
> > >
> > > Have you performed any [perf] testing on SLAB with this patchset?
> >
> > The accounting part is the same for SLAB and SLUB, so there should be no
> > significant difference. I've checked that it compiles, boots and passes
> > kselftests. And that memory savings are there.
> >
> 
> What about performance? Also you mentioned that sharing kmem-cache
> between accounted and non-accounted can have additional overhead. Any
> difference between SLAB and SLUB for such a case?

Not really.

Sharing a single set of caches adds some overhead to root- and non-accounted
allocations, which is something I tried hard to avoid in my original version.
But I have to admit, it allows simplifying and removing a lot of code, and here
it's hard to argue with Johannes, who pushed for this design.

With performance testing it's not that easy, because it's not obvious what
we wanna test. Obviously, per-object accounting is more expensive, and
measuring something like 1000000 allocations and deallocations in a row from
a single kmem_cache will show a regression. But in the real world the relative
cost of allocations is usually low, and we can get some benefits from a smaller
working set and from having shared kmem_cache objects cache hot.
Not to mention some extra memory savings and reduced fragmentation.

We've done extensive testing of the original version in Facebook production,
and we haven't noticed any regressions so far. But I have to admit, we were
using the original version with two sets of kmem_caches.

If you have any specific tests in mind, I can definitely run them. Or if you
can help with the performance evaluation, I'd appreciate it a lot.

Thanks!
