Message-ID: <CAHk-=wjMKONPsXAJ=yJuPBEAx6HdYRkYE8TdYVBvpm3=x_EnCw@mail.gmail.com>
Date: Sun, 26 Nov 2023 12:23:57 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: kernel test robot <oliver.sang@...el.com>
Cc: Christian Brauner <brauner@...nel.org>, oe-lkp@...ts.linux.dev,
lkp@...el.com, linux-kernel@...r.kernel.org,
Jann Horn <jannh@...gle.com>, linux-doc@...r.kernel.org,
linuxppc-dev@...ts.ozlabs.org, intel-gfx@...ts.freedesktop.org,
linux-fsdevel@...r.kernel.org, gfs2@...ts.linux.dev,
bpf@...r.kernel.org, ying.huang@...el.com, feng.tang@...el.com,
fengwei.yin@...el.com
Subject: Re: [linus:master] [file] 0ede61d858: will-it-scale.per_thread_ops
-2.9% regression
On Sun, 19 Nov 2023 at 23:11, kernel test robot <oliver.sang@...el.com> wrote:
>
> kernel test robot noticed a -2.9% regression of will-it-scale.per_thread_ops on:
>
> commit: 0ede61d8589cc2d93aa78230d74ac58b5b8d0244 ("file: convert to SLAB_TYPESAFE_BY_RCU")
Ok, so __fget_light() is one of our more performance-critical things,
and that commit ends up making it call a rather more expensive version
in __get_file_rcu(), so we have:
> 30.90 ± 4% -20.6 10.35 ± 2% perf-profile.self.cycles-pp.__fget_light
> 0.00 +26.5 26.48 perf-profile.self.cycles-pp.__get_file_rcu
and that "20% decrease balanced by 26% increase elsewhere" then
directly causes the ~3% regression.
I took a look at the code generation, and honestly, I think we're
better off just making __fget_files_rcu() have special logic for all
of this, rather than using __get_file_rcu().
The 'fd' case really is special because we need to do that
non-speculative pointer access.
Because it turns out that we already have to use array_index_nospec()
to safely generate that pointer to the fd entry, and once you have to
do that "use non-speculative accesses to generate a safe pointer", you
might as well just go whole hog.
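(For reference, that existing pattern looks roughly like this - a
simplified sketch from memory, not the exact fs/file.c source:

        if (unlikely(fd >= fdt->max_fds))
                return NULL;
        /* clamp the index so speculation cannot run past the array */
        fdentry = fdt->fd + array_index_nospec(fd, fdt->max_fds);
        file = rcu_dereference_raw(*fdentry);

so the non-speculative masking work is already being done just to form
that pointer.)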
So this takes a different approach, and uses the
array_index_mask_nospec() thing that we have exactly to generate that
non-speculative mask for these things:
        /* Mask is a 0 for invalid fd's, ~0 for valid ones */
        mask = array_index_mask_nospec(fd, fdt->max_fds);
and then it does something you can consider either horribly clever, or
horribly ugly (this first part is basically just
array_index_nospec()):
        /* fdentry points to the 'fd' offset, or fdt->fd[0] */
        fdentry = fdt->fd + (fd & mask);
and then we can just *unconditionally* do the load - but since we
might be loading fd[0] for an invalid case, we need to mask the result
too:
        /* Do the load, then mask any invalid result */
        file = rcu_dereference_raw(*fdentry);
        file = (void *)(mask & (unsigned long)file);
but now we have done everything without any conditionals, and the only
remaining conditional is "did we load NULL" - which also covers the
"we masked off a bad value" case.
Then we just do that atomic_long_inc_not_zero() on the file, and
re-check the pointer chain we used.
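Putting those pieces together, the lookup loop ends up looking
something like this - just a rough sketch of the idea, not the
attached patch, and the names and exact re-check details are
approximate:

        /* under rcu_read_lock(), in __fget_files_rcu() */
        for (;;) {
                struct fdtable *fdt = rcu_dereference_raw(files->fdt);
                struct file __rcu **fdentry;
                struct file *file;
                unsigned long mask;

                /* Mask is a 0 for invalid fd's, ~0 for valid ones */
                mask = array_index_mask_nospec(fd, fdt->max_fds);

                /* fdentry points to the 'fd' offset, or fdt->fd[0] */
                fdentry = fdt->fd + (fd & mask);

                /* Do the load, then mask any invalid result */
                file = rcu_dereference_raw(*fdentry);
                file = (void *)(mask & (unsigned long)file);
                if (!file)
                        return NULL;

                /* SLAB_TYPESAFE_BY_RCU: the file may have been recycled */
                if (unlikely(!atomic_long_inc_not_zero(&file->f_count)))
                        continue;

                /* re-check the pointer chain we used */
                if (unlikely(rcu_dereference_raw(files->fdt) != fdt) ||
                    unlikely(rcu_dereference_raw(*fdentry) != file)) {
                        fput(file);
                        continue;
                }
                return file;
        }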
I made files_lookup_fd_raw() do the same thing.
The end result is much nicer code generation at least from my quick
check. And I assume the regression will be gone, and hopefully even
turned into an improvement since this is so hot.
Comments? I also looked at that odd OPTIMIZER_HIDE_VAR() that
__get_file_rcu() does, and I don't get it. Both values come from
volatile accesses, so I don't see the point of those games, but I also
didn't care, since it's no longer in a critical code path.
Christian?
NOTE! This patch is not well tested. I verified an earlier version of
this, but have been playing with it since, so caveat emptor.
IOW, I might have messed up some "trivial cleanup" when prepping for
sending it out...
Linus
View attachment "patch.diff" of type "text/x-patch" (4120 bytes)