Message-ID: <8fc59462-2940-4e60-95f1-2955a8c24ea0@suse.cz>
Date: Tue, 30 Jan 2024 12:04:57 +0100
From: Vlastimil Babka <vbabka@...e.cz>
To: Linus Torvalds <torvalds@...ux-foundation.org>,
Shakeel Butt <shakeelb@...gle.com>
Cc: Roman Gushchin <roman.gushchin@...ux.dev>,
Josh Poimboeuf <jpoimboe@...nel.org>, Jeff Layton <jlayton@...nel.org>,
Chuck Lever <chuck.lever@...cle.com>, Johannes Weiner <hannes@...xchg.org>,
Michal Hocko <mhocko@...nel.org>, linux-kernel@...r.kernel.org,
Jens Axboe <axboe@...nel.dk>, Tejun Heo <tj@...nel.org>,
Vasily Averin <vasily.averin@...ux.dev>, Michal Koutny <mkoutny@...e.com>,
Waiman Long <longman@...hat.com>, Muchun Song <muchun.song@...ux.dev>,
Jiri Kosina <jikos@...nel.org>, cgroups@...r.kernel.org, linux-mm@...ck.org,
Howard McLauchlan <hmclauchlan@...com>, bpf <bpf@...r.kernel.org>
Subject: Re: [PATCH RFC 1/4] fs/locks: Fix file lock cache accounting, again
On 1/26/24 10:50, Vlastimil Babka wrote:
> On 1/22/24 06:10, Linus Torvalds wrote:
>> On Wed, 17 Jan 2024 at 14:56, Shakeel Butt <shakeelb@...gle.com> wrote:
>>> >
>>> > So I don't see how we can make it really cheap (say, less than 5% overhead)
>>> > without caching pre-accounted objects.
>>>
>>> Maybe this is what we want. Now we are down to just SLUB, maybe such
>>> caching of pre-accounted objects can be in SLUB layer and we can
>>> decide to keep this caching per-kmem-cache opt-in or always on.
>>
>> So it turns out that we have another case of SLAB_ACCOUNT being quite
>> a big expense, and it's actually the normal - but failed - open() or
>> execve() case.
>>
>> See the thread at
>>
>> https://lore.kernel.org/all/CAHk-=whw936qzDLBQdUz-He5WK_0fRSWwKAjtbVsMGfX70Nf_Q@mail.gmail.com/
>>
>> and to see the effect in profiles, you can use this EXTREMELY stupid
>> test program:
>>
>> #include <fcntl.h>
>>
>> int main(int argc, char **argv)
>> {
>>     for (int i = 0; i < 10000000; i++)
>>         open("nonexistent", O_RDONLY);
>> }
>
> This reminded me that I can see should_failslab() in the profiles (1.43% plus the
> overhead in its caller) even if it does nothing at all, and it's completely
> unconditional since commit 4f6923fbb352 ("mm: make should_failslab always
> available for fault injection").
>
> We discussed it briefly when Jens tried to change it in [1] to depend on
> CONFIG_FAILSLAB again. But now I think it should even be possible to leave
> it always available, but behind a static key. BPF or whoever else uses these
> error injection hooks would have to track how many users of them are active
> and manage the static key accordingly. Then it could be always available,
> but have no overhead when there's no active user? Other similar hooks could
> benefit from such an approach too?
>
> [1]
> https://lore.kernel.org/all/e01e5e40-692a-519c-4cba-e3331f173c82@kernel.dk/#t
Just for completeness, with the hunk below I've seen roughly a 2% improvement
on Linus's test program above.
Of course something needs to operate the static key, and I'm not familiar
enough with BPF and the related machinery to know whether it already tracks
the users of these injection points somewhere that the static key toggling
could be hooked into.
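Purely as a sketch of what operating the key could look like (the helper
names below are made up; whoever ends up owning this - the error injection
core, or the BPF attach/detach path - would call them), the active users
could be refcounted with static_branch_inc()/dec():

/*
 * Hypothetical helpers, not part of the hunk below; they would have to
 * live in mm/slub.c since the key is static there. Each active user of
 * the should_failslab() injection point bumps the key, and the last one
 * to go away drops it, so the branch in slab_pre_alloc_hook() stays
 * patched out unless fault injection is actually in use.
 */
void failslab_hook_get(void)
{
	static_branch_inc(&failslab_enabled);
}

void failslab_hook_put(void)
{
	static_branch_dec(&failslab_enabled);
}

static_branch_inc()/dec() nest, so several concurrent users (e.g. multiple
BPF programs attached to the same injection point) would keep the key
enabled until the last one detaches.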
diff --git a/mm/slub.c b/mm/slub.c
index 2ef88bbf56a3..da07b358d092 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3750,6 +3750,8 @@ noinline int should_failslab(struct kmem_cache *s, gfp_t gfpflags)
 }
 ALLOW_ERROR_INJECTION(should_failslab, ERRNO);
 
+static DEFINE_STATIC_KEY_FALSE(failslab_enabled);
+
 static __fastpath_inline
 struct kmem_cache *slab_pre_alloc_hook(struct kmem_cache *s,
 				       struct list_lru *lru,
@@ -3760,8 +3762,10 @@ struct kmem_cache *slab_pre_alloc_hook(struct kmem_cache *s,
 
 	might_alloc(flags);
 
-	if (unlikely(should_failslab(s, flags)))
-		return NULL;
+	if (static_branch_unlikely(&failslab_enabled)) {
+		if (unlikely(should_failslab(s, flags)))
+			return NULL;
+	}
 
 	if (unlikely(!memcg_slab_pre_alloc_hook(s, lru, objcgp, size, flags)))
 		return NULL;