linux-kernel - Re: [PATCH v3 0/4] mm/slub: Fix count

Message-ID: <YE+wBMuX1Q0rhPQj@carbon.dhcp.thefacebook.com>
Date:   Mon, 15 Mar 2021 12:05:40 -0700
From:   Roman Gushchin <guro@...com>
To:     Vlastimil Babka <vbabka@...e.cz>
CC:     Xunlei Pang <xlpang@...ux.alibaba.com>,
        Christoph Lameter <cl@...ux.com>,
        Pekka Enberg <penberg@...nel.org>,
        Konstantin Khlebnikov <khlebnikov@...dex-team.ru>,
        David Rientjes <rientjes@...gle.com>,
        Matthew Wilcox <willy@...radead.org>,
        Shu Ming <sming56@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        <linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>,
        Wen Yang <wenyang@...ux.alibaba.com>,
        James Wang <jnwang@...ux.alibaba.com>,
        Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH v3 0/4] mm/slub: Fix count_partial() problem


On Mon, Mar 15, 2021 at 07:49:57PM +0100, Vlastimil Babka wrote:
> On 3/9/21 4:25 PM, Xunlei Pang wrote:
> > count_partial() can hold the n->list_lock spinlock for quite a long time,
> > which causes a lot of trouble for the system. This series eliminates this problem.
> 
> Before I check the details, I have two high-level comments:
> 
> - patch 1 introduces some counting scheme that patch 4 then changes, could we do
> this in one step to avoid the churn?
> 
> - the series addresses the concern that spinlock is being held, but doesn't
> address the fact that counting partial per-node slabs is not nearly enough if we
> want accurate <active_objs> in /proc/slabinfo because there are also percpu
> slabs and per-cpu partial slabs, where we don't track the free objects at all.
> So after this series while the readers of /proc/slabinfo won't block the
> spinlock, they will get the same garbage data as before. So Christoph is not
> wrong to say that we can just report active_objs == num_objs and it won't
> actually break any ABI.
> At the same time somebody might actually want accurate object statistics at the
> expense of peak performance, and it would be nice to give them such an option in
> SLUB. Right now we don't provide this accuracy even with CONFIG_SLUB_STATS,
> although that option provides many additional tuning stats, with additional
> overhead.
> So my proposal would be a new config for "accurate active objects" (or just tie
> it to CONFIG_SLUB_DEBUG?) that would extend the approach of percpu counters in
> patch 4 to all alloc/free, so that it includes percpu slabs. Without this config
> enabled, let's just report active_objs == num_objs.

It sounds really good to me! The only thing: I'd avoid introducing a new option
and would use CONFIG_SLUB_STATS instead.

It seems like CONFIG_SLUB_DEBUG is a more popular option than CONFIG_SLUB_STATS.
CONFIG_SLUB_DEBUG is enabled on my Fedora workstation, CONFIG_SLUB_STATS is off.
I doubt an average user needs this data, so I'd go with CONFIG_SLUB_STATS.
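
To make the proposal quoted above concrete, here is a minimal userspace sketch
of the idea, not code from the series: every alloc/free bumps a per-CPU slot,
and the reader sums the slots, or simply reports active_objs == num_objs when
the option is off. All names here (struct cache_model, model_alloc(),
model_free(), model_active_objs(), NR_CPUS_SKETCH) are hypothetical, and the
"accurate" flag only stands in for whichever config option ends up gating this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS_SKETCH 4	/* hypothetical fixed CPU count for this model */

/* Hypothetical model of one cache; this is not the kernel's struct kmem_cache. */
struct cache_model {
	unsigned long num_objs;			/* total objects in all slabs */
	long active_delta[NR_CPUS_SKETCH];	/* per-CPU net (allocs - frees) */
	bool accurate;				/* stands in for the config option */
};

/* Alloc path: touch only the current CPU's slot, no shared lock taken. */
static void model_alloc(struct cache_model *c, int cpu)
{
	if (c->accurate)
		c->active_delta[cpu]++;
}

/*
 * Free path: an object may be freed on a different CPU than the one that
 * allocated it, so an individual slot can go negative. Only the sum matters.
 */
static void model_free(struct cache_model *c, int cpu)
{
	if (c->accurate)
		c->active_delta[cpu]--;
}

/* Reader side: sum all slots, or fall back to active_objs == num_objs. */
static unsigned long model_active_objs(const struct cache_model *c)
{
	long sum = 0;

	assert(c != NULL);
	if (!c->accurate)
		return c->num_objs;

	for (size_t cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		sum += c->active_delta[cpu];

	/* A racy snapshot could be transiently negative; clamp it to zero. */
	return sum < 0 ? 0 : (unsigned long)sum;
}
```

In this model the reader never takes a lock, so the cost moves from the
slabinfo reader to one extra increment or decrement on each alloc/free.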

Thanks!

> 
> Vlastimil
> 
> > v1->v2:
> > - Improved changelog and variable naming for PATCH 1~2.
> > - PATCH3 adds per-cpu counter to avoid performance regression
> >   in concurrent __slab_free().
> > 
> > v2->v3:
> > - Changed "page->inuse" to the safe "new.inuse", etc.
> > - Used CONFIG_SLUB_DEBUG and CONFIG_SYSFS condition for new counters.
> > - atomic_long_t -> unsigned long
> > 
> > [Testing]
> > According to my tests, there might be a small performance impact under
> > extremely concurrent __slab_free() calls.
> > 
> > On my 32-cpu 2-socket physical machine:
> > Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
> > 
> > 1) perf stat --null --repeat 10 -- hackbench 20 thread 20000
> > 
> > == original, not patched
> > Performance counter stats for 'hackbench 20 thread 20000' (10 runs):
> > 
> >       24.536050899 seconds time elapsed                                          ( +-  0.24% )
> > 
> > 
> > Performance counter stats for 'hackbench 20 thread 20000' (10 runs):
> > 
> >       24.588049142 seconds time elapsed                                          ( +-  0.35% )
> > 
> > 
> > == patched with patch1~4
> > Performance counter stats for 'hackbench 20 thread 20000' (10 runs):
> > 
> >       24.670892273 seconds time elapsed                                          ( +-  0.29% )
> > 
> > 
> > Performance counter stats for 'hackbench 20 thread 20000' (10 runs):
> > 
> >       24.746755689 seconds time elapsed                                          ( +-  0.21% )
> > 
> > 
> > 2) perf stat --null --repeat 10 -- hackbench 32 thread 20000
> > 
> > == original, not patched
> >  Performance counter stats for 'hackbench 32 thread 20000' (10 runs):
> > 
> >       39.784911855 seconds time elapsed                                          ( +-  0.14% )
> > 
> >  Performance counter stats for 'hackbench 32 thread 20000' (10 runs):
> > 
> >       39.868687608 seconds time elapsed                                          ( +-  0.19% )
> > 
> > == patched with patch1~4
> >  Performance counter stats for 'hackbench 32 thread 20000' (10 runs):
> > 
> >       39.681273015 seconds time elapsed                                          ( +-  0.21% )
> > 
> >  Performance counter stats for 'hackbench 32 thread 20000' (10 runs):
> > 
> >       39.681238459 seconds time elapsed                                          ( +-  0.09% )
> > 
> > 
> > Xunlei Pang (4):
> >   mm/slub: Introduce two counters for partial objects
> >   mm/slub: Get rid of count_partial()
> >   percpu: Export per_cpu_sum()
> >   mm/slub: Use percpu partial free counter
> > 
> >  include/linux/percpu-defs.h   |  10 ++++
> >  kernel/locking/percpu-rwsem.c |  10 ----
> >  mm/slab.h                     |   4 ++
> >  mm/slub.c                     | 120 +++++++++++++++++++++++++++++-------------
> >  4 files changed, 97 insertions(+), 47 deletions(-)
> > 
>
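
For reference, the quoted hackbench results can be reduced to a relative change
per configuration. The following is only a sketch of that arithmetic: the helper
names (mean_secs(), rel_change_pct()) are made up for illustration, and the
elapsed times are copied from the perf stat output quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Mean of n elapsed-time samples, in seconds. */
static double mean_secs(const double *runs, size_t n)
{
	double sum = 0.0;

	assert(n > 0);
	for (size_t i = 0; i < n; i++)
		sum += runs[i];
	return sum / (double)n;
}

/* Relative change of patched vs. original in percent (positive = slower). */
static double rel_change_pct(const double *orig, size_t n_orig,
			     const double *patched, size_t n_patched)
{
	double base = mean_secs(orig, n_orig);

	return (mean_secs(patched, n_patched) - base) / base * 100.0;
}

/* "hackbench 20 thread 20000": two perf stat summaries per kernel. */
static const double orig_20[]    = { 24.536050899, 24.588049142 };
static const double patched_20[] = { 24.670892273, 24.746755689 };

/* "hackbench 32 thread 20000": two perf stat summaries per kernel. */
static const double orig_32[]    = { 39.784911855, 39.868687608 };
static const double patched_32[] = { 39.681273015, 39.681238459 };
```

Calling rel_change_pct(orig_20, 2, patched_20, 2) gives roughly +0.6% (slower),
and rel_change_pct(orig_32, 2, patched_32, 2) gives roughly -0.4% (faster), both
of the same order as the reported run-to-run variation.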