linux-kernel - Re: [RFC] mm/vmscan: add periodic slab shrinker

Message-ID: <87ilrn5ttl.fsf@stepbren-lnx.us.oracle.com>
Date:   Tue, 05 Apr 2022 10:22:14 -0700
From:   Stephen Brennan <stephen.s.brennan@...cle.com>
To:     Dave Chinner <david@...morbit.com>, Hillf Danton <hdanton@...a.com>
Cc:     Roman Gushchin <roman.gushchin@...ux.dev>, MM <linux-mm@...ck.org>,
        Matthew Wilcox <willy@...radead.org>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Yu Zhao <yuzhao@...gle.com>,
        David Hildenbrand <david@...hat.com>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC] mm/vmscan: add periodic slab shrinker

Dave Chinner <david@...morbit.com> writes:
[snip]
> If the cache keeps growing, then its objects are being repeatedly
> referenced and it *should* keep growing. If it's one-off objects
> that are causing the growth of the cache and so objects are being
> reclaimed by the shrinker, then matching the periodic shrink scan to
> the growth rate will substantially reduce the rate of growth of that
> cache.

I can't speak for every slab cache, but I've been coming to the same
conclusion myself regarding the dentry cache. I think that the rate of
stepping through the LRU should be tied to the rate of allocations.
Truly in-use objects shouldn't be harmed by this, as they should get
referenced and rotated to the beginning of the LRU. But the one-offs
which are bloating the cache will be found and removed.

My dentry-related patch here [1] does tie the reclaim to the rate of
allocations. In that patch, I looked for sibling negative dentries to
reclaim, which is just silly in hindsight :)

I've implemented a version of this patch which just takes one step
through the LRU on each d_alloc. It's quite interesting to watch it
work. You can create 5 million negative dentries in directory /A via
stat(), and then create 5 million negative dentries in directory /B. The
total dentry slab size reaches 5 million but never goes past it, since
the old negative dentries from /A aren't really in use, and they get
pruned at the same rate as negative dentries from /B get created. On the
other hand, if you *continue* to stat() on the dentries of /A while you
create negative dentries in /B, then the cache grows to 10 million,
since the /A dentries are actually in use.

Maybe a solution could involve some generic list_lru machinery that can
let you do these sorts of incremental scans? Maybe batching them up so
instead of doing one every allocation, you do them every 100 or 1000?
It would still be up to the individual user to put this to good use in
the object allocation path.

Thanks,
Stephen

[1] https://lore.kernel.org/linux-fsdevel/20220209231406.187668-1-stephen.s.brennan@oracle.com/