linux-kernel - Re: [PATCH] mm/slub: improve count_partial() for CONFIG_SLUB_CPU

Message-ID: <f50afb1b-bb63-eae9-4f8c-dbc5f678d43d@suse.cz>
Date:   Tue, 25 Feb 2020 16:49:20 +0100
From:   Vlastimil Babka <vbabka@...e.cz>
To:     Roman Gushchin <guro@...com>, Christopher Lameter <cl@...ux.com>
Cc:     Wen Yang <wenyang@...ux.alibaba.com>,
        Pekka Enberg <penberg@...nel.org>,
        David Rientjes <rientjes@...gle.com>,
        Joonsoo Kim <iamjoonsoo.kim@....com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Xunlei Pang <xlpang@...ux.alibaba.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm/slub: improve count_partial() for
 CONFIG_SLUB_CPU_PARTIAL

On 2/24/20 5:57 PM, Roman Gushchin wrote:
> On Mon, Feb 24, 2020 at 01:29:09AM +0000, Christoph Lameter wrote:
>> On Sat, 22 Feb 2020, Wen Yang wrote:
> 
> Hello, Christopher!
> 
>>
>>> We also observed that in this scenario, CONFIG_SLUB_CPU_PARTIAL is turned
>>> on by default, and count_partial() is useless because the returned number
>>> is far from the reality.
>>
>> Well, it's not useless. It's just not counting the partial objects in per cpu
>> partial slabs. Those are counted by a different counter.
> 
> Do you mean CPU_PARTIAL_ALLOC or something else?
> 
> "useless" isn't the most accurate wording, sorry for that.
> 
> The point is that the number of active objects displayed in /proc/slabinfo
> is misleading if percpu partial lists are used. So it's strange to pay
> for it by potentially slowing down concurrent allocations.

Hmm, I wonder... kmem_cache_cpu has those quite detailed stats with
CONFIG_SLUB_STATS. Could perhaps the number of free objects be
reconstructed from them by adding up / subtracting the relevant items
across all CPUs? Expensive, but the cost would be taken by the
/proc/slabinfo reader, without blocking anyone.

Then again, CONFIG_SLUB_STATS is disabled by default. But the same
percpu mechanism could be used to create some "stats light" variant that
doesn't count everything, just what's needed to track number of free
objects. Percpu should mean the atomic inc/decs wouldn't cause much
contention...
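
To make the "stats light" idea concrete, here is a minimal userspace sketch in
C. It is an illustration under stated assumptions, not kernel code: the names
free_objs_pcpu, note_*() and count_free_objects() are hypothetical, the CPU
count is fixed, and plain array slots stand in for real percpu variables (in
the kernel one would presumably use this_cpu_inc()/this_cpu_dec() on the fast
path and a for_each_possible_cpu() loop in the reader).

```c
#include <stddef.h>

#define NR_CPUS_SKETCH 4 /* hypothetical fixed CPU count for this sketch */

/*
 * Hypothetical "stats light": one signed delta per CPU.
 * A single slot may go negative (object allocated on CPU 0, freed on
 * CPU 1); only the sum over all CPUs is meaningful.
 */
struct free_objs_pcpu {
	long delta;
};

static struct free_objs_pcpu free_objs[NR_CPUS_SKETCH];

/* A new slab with 'objects' free objects was added to the cache. */
static void note_slab_added(int cpu, long objects)
{
	free_objs[cpu].delta += objects;
}

/* A slab holding 'objects' free objects was returned to the page allocator. */
static void note_slab_removed(int cpu, long objects)
{
	free_objs[cpu].delta -= objects;
}

/* Fast paths: each CPU only touches its own slot, so no shared lock. */
static void note_alloc(int cpu)
{
	free_objs[cpu].delta--;
}

static void note_free(int cpu)
{
	free_objs[cpu].delta++;
}

/*
 * Reader side (e.g. the /proc/slabinfo handler): the whole cost of
 * walking all CPUs is paid here, and nobody is blocked meanwhile.
 * With concurrent updates the result would only be approximate.
 */
static long count_free_objects(void)
{
	long sum = 0;
	size_t cpu;

	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		sum += free_objs[cpu].delta;
	return sum;
}
```

The point of the design is that the allocation and free paths stay cheap,
while the reader accepts a slightly stale answer in exchange for not taking
any list lock.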

It's certainly useful to have an idea of slab fragmentation (low inuse
vs total objects) from /proc/slabinfo. But if that remains available via
/sys/kernel/slab/ then I guess it's fine... until all continuous
monitoring tools that now read /proc/slabinfo periodically start reading
all those /sys/kernel/slab/ files periodically...
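
For reference, the fragmentation figure that such monitoring tools derive can
be sketched as follows. This is a minimal userspace example in C, assuming the
slabinfo 2.x line layout where the cache name is followed by <active_objs> and
<num_objs>; struct slab_usage, parse_slabinfo_line() and inuse_percent() are
hypothetical names. As discussed above, active_objs itself may be inaccurate
when percpu partial lists are in use, so the ratio inherits that error.

```c
#include <stdio.h>

/* Hypothetical container for the first three fields of a slabinfo line. */
struct slab_usage {
	char name[64];
	unsigned long active_objs;
	unsigned long num_objs;
};

/*
 * Parse "name <active_objs> <num_objs> ..." from one slabinfo line.
 * Returns 0 on success, -1 for lines that do not match, which covers
 * the "slabinfo - version" header and the "# name ..." legend line
 * because neither has a number in the second field.
 */
static int parse_slabinfo_line(const char *line, struct slab_usage *out)
{
	if (sscanf(line, "%63s %lu %lu", out->name,
		   &out->active_objs, &out->num_objs) != 3)
		return -1;
	return 0;
}

/*
 * Percentage of objects in use; a low value suggests fragmentation.
 * An empty cache (num_objs == 0) is reported as 100 by convention here.
 */
static unsigned long inuse_percent(const struct slab_usage *u)
{
	if (u->num_objs == 0)
		return 100;
	return u->active_objs * 100 / u->num_objs;
}
```

A tool would call parse_slabinfo_line() on each line read from /proc/slabinfo
and flag caches whose inuse_percent() falls below some threshold of its
choosing.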