linux-kernel - Re: [PATCH RFC 1/4] fs/locks: Fix file lock cache accounting, again

Message-ID: <f1e43327-30ce-4a73-8a24-c813a516f97f@suse.cz>
Date: Wed, 17 Jan 2024 22:19:48 +0100
From: Vlastimil Babka <vbabka@...e.cz>
To: Linus Torvalds <torvalds@...ux-foundation.org>,
 Josh Poimboeuf <jpoimboe@...nel.org>
Cc: Jeff Layton <jlayton@...nel.org>, Chuck Lever <chuck.lever@...cle.com>,
 Shakeel Butt <shakeelb@...gle.com>, Roman Gushchin
 <roman.gushchin@...ux.dev>, Johannes Weiner <hannes@...xchg.org>,
 Michal Hocko <mhocko@...nel.org>, linux-kernel@...r.kernel.org,
 Jens Axboe <axboe@...nel.dk>, Tejun Heo <tj@...nel.org>,
 Vasily Averin <vasily.averin@...ux.dev>, Michal Koutny <mkoutny@...e.com>,
 Waiman Long <longman@...hat.com>, Muchun Song <muchun.song@...ux.dev>,
 Jiri Kosina <jikos@...nel.org>, cgroups@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH RFC 1/4] fs/locks: Fix file lock cache accounting, again

On 1/17/24 21:20, Linus Torvalds wrote:
> On Wed, 17 Jan 2024 at 11:39, Josh Poimboeuf <jpoimboe@...nel.org> wrote:
>>
>> That's a good point.  If the microbenchmark isn't likely to be even
>> remotely realistic, maybe we should just revert the revert until if/when
>> somebody shows a real world impact.
>>
>> Linus, any objections to that?
> 
> We use SLAB_ACCOUNT for much more common allocations like queued
> signals, so I would tend to agree with Jeff that it's probably just
> some not very interesting microbenchmark that shows any file locking
> effects from SLAB_ALLOC, not any real use.
> 
> That said, those benchmarks do matter. It's very easy to say "not
> relevant in the big picture" and then the end result is that
> everything is a bit of a pig.
> 
> And the regression was absolutely *ENORMOUS*. We're not talking "a few
> percent". We're talking a 33% regression that caused the revert:
> 
>    https://lore.kernel.org/lkml/20210907150757.GE17617@xsang-OptiPlex-9020/
> 
> I wish our SLAB_ACCOUNT wasn't such a pig. Rather than account every
> single allocation, it would be much nicer to account at a bigger
> granularity, possibly by having per-thread counters first before
> falling back to the obj_cgroup_charge. Whatever.

Counters are one thing (afaik some batching happens on the memcg side via
"stocks"), but another is associating the memcg with the allocated objects
in slab pages, so kmem_cache_free() knows which counter to decrement. We'll
have to see where the overhead is today.
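
To illustrate the batching idea discussed above, here is a minimal userspace
sketch, not the kernel's memcg code: a per-thread "stock" of pre-charged bytes
absorbs repeated alloc/free pairs so the shared counter is touched only on
refill or drain. The names charge(), uncharge(), STOCK_BATCH and shared_charged
are hypothetical and chosen only for this example. The sketch covers only the
counter side; it does not model associating each object with its memcg.

```c
#include <stdatomic.h>

/* Hypothetical batch size in bytes; an assumption for this sketch only. */
#define STOCK_BATCH 4096L

/* Stand-in for the shared, contended per-cgroup counter. */
static atomic_long shared_charged;
/* Counts how often the shared counter had to be touched. */
static atomic_long slow_path_hits;

/* Per-thread cache of bytes already charged to the shared counter. */
struct stock {
    long cached;
};

static _Thread_local struct stock local_stock;

/* Charge 'size' bytes, preferring the thread-local stock. */
static void charge(long size)
{
    struct stock *s = &local_stock;

    if (s->cached < size) {
        /* Slow path: refill from the shared counter in one batch. */
        long refill = size > STOCK_BATCH ? size : STOCK_BATCH;

        atomic_fetch_add(&shared_charged, refill);
        atomic_fetch_add(&slow_path_hits, 1);
        s->cached += refill;
    }
    s->cached -= size;
}

/* Return 'size' bytes to the local stock; drain any large excess. */
static void uncharge(long size)
{
    struct stock *s = &local_stock;

    s->cached += size;
    if (s->cached > 2 * STOCK_BATCH) {
        /* Give everything above one batch back to the shared counter. */
        atomic_fetch_sub(&shared_charged, s->cached - STOCK_BATCH);
        atomic_fetch_add(&slow_path_hits, 1);
        s->cached = STOCK_BATCH;
    }
}
```

With this scheme, a loop that repeatedly calls charge(200) followed by
uncharge(200) on one thread takes the slow path only once, on the first
charge; every later iteration stays in the thread-local stock.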

If there's overhead due to calls between mm/slub.c and mm/memcontrol.c we
can now reduce that with SLAB gone.

> It's kind of stupid to have a benchmark that just allocates and
> deallocates a file lock in quick succession spend lots of time
> incrementing and decrementing cgroup charges for that repeated
> alloc/free.
> 
> However, that problem with SLAB_ACCOUNT is not the fault of file
> locking, but more of a slab issue.
> 
> End result: I think we should bring in Vlastimil and whoever else is
> doing SLAB_ACCOUNT things, and have them look at that side.

Roman and Shakeel are already Cc'd. Roman recently did
https://lore.kernel.org/lkml/20231019225346.1822282-1-roman.gushchin@linux.dev/
which is mentioned in the cover letter and was merged in 6.7, but the cover
letter says it didn't help much, too bad. So is the regression still 33%, or
how much is it now?

> And then just enable SLAB_ACCOUNT for file locks. But very much look
> at silly costs in SLAB_ACCOUNT first, at least for trivial
> "alloc/free" patterns..
> 
> Vlastimil? Who would be the best person to look at that SLAB_ACCOUNT
> thing? See commit 3754707bcc3e (Revert "memcg: enable accounting for
> file lock caches") for the history here.
> 
>                  Linus