Message-ID: <CAKwvOdnr_F9-voPj4cp2HG8=U32a8Hp1aLpynSQiKOrwe4txpQ@mail.gmail.com>
Date: Tue, 6 Sep 2022 11:26:31 -0700
From: Nick Desaulniers <ndesaulniers@...gle.com>
To: Vincent Mailhol <mailhol.vincent@...adoo.fr>
Cc: Borislav Petkov <bp@...en8.de>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, x86@...nel.org,
Peter Zijlstra <peterz@...radead.org>,
Dave Hansen <dave.hansen@...ux.intel.com>,
"H . Peter Anvin" <hpa@...or.com>,
Nathan Chancellor <nathan@...nel.org>,
Tom Rix <trix@...hat.com>, linux-kernel@...r.kernel.org,
llvm@...ts.linux.dev, David Howells <dhowells@...hat.com>,
Jan Beulich <JBeulich@...e.com>,
Christophe Jaillet <christophe.jaillet@...adoo.fr>,
Joe Perches <joe@...ches.com>,
Josh Poimboeuf <jpoimboe@...nel.org>,
Yury Norov <yury.norov@...il.com>
Subject: Re: [PATCH v7 0/2] x86/asm/bitops: optimize ff{s,z} functions for
constant expressions
On Sun, Sep 4, 2022 at 5:38 PM Vincent Mailhol
<mailhol.vincent@...adoo.fr> wrote:
>
> The compilers provide some builtin expression equivalent to the ffs(),
> __ffs() and ffz() functions of the kernel. The kernel uses optimized
> assembly which produces better code than the builtin
> functions. However, such assembly code can not be folded when used
> with constant expressions.
Another tack, which may help code bases other than just the Linux
kernel: it seems that compilers should be able to fold this themselves.
Vincent, if you're interested in making such an optimization in LLVM,
we'd welcome the contribution, and I'd be happy to show you where to
make such changes within LLVM; please let me know off thread.
If not, at the least, we should file feature requests in both:
* https://github.com/llvm/llvm-project/issues
* https://gcc.gnu.org/bugzilla/
>
> This series relies on __builtin_constant_p to select the optimal solution:
>
> * use kernel assembly for non constant expressions
>
> * use compiler's __builtin function for constant expressions.
>
>
> ** Statistics **
>
> Patch 1/2 optimizes 26.7% of ffs() calls and patch 2/2 optimizes 27.9%
> of __ffs() and ffz() calls (details of the calculation in each patch).
>
>
> ** Changelog **
>
> v6 -> v7:
>
> * (no changes on code, only commit tag was modified)
>
> * Add Reviewed-by: Yury Norov <yury.norov@...il.com> in both patches
>
>
> v5 -> v6:
> * Rename variable___ffs() into variable__ffs() (two underscores
> instead of three)
>
>
> v4 -> v5:
>
> * (no changes on code, only commit comment was modified)
>
> * Rewrite the commit log:
> - Use two spaces instead of `| ' to indent code snippets.
> - Do not use `we'.
> - Do not use `this patch' in the commit description. Instead,
> use imperative tone.
> Link: https://lore.kernel.org/all/YvUZVYxbOMcZtR5G@zn.tnic/
>
>
> v3 -> v4:
>
> * (no changes on code, only commit comment was modified)
>
> * Remove note and link to Nick's message in patch 1/2, cf.:
> Link: https://lore.kernel.org/all/CAKwvOdnnDaiJcV1gr9vV+ya-jWxx7+2KJNTDThyFctVDOgt9zQ@mail.gmail.com/
>
> * Add Reviewed-by: Nick Desaulniers <ndesaulniers@...gle.com> tag in
> patch 2/2.
>
>
> v2 -> v3:
>
> * Redacted out the instructions after ret and before next function
> in the assembly output.
>
> * Added a note and a link to Nick's message on the constant
> propagation missed-optimization in clang:
> Link: https://lore.kernel.org/all/CAKwvOdnH_gYv4qRN9pKY7jNTQK95xNeH1w1KZJJmvCkh8xJLBg@mail.gmail.com/
>
> * Fix copy/paste typo in statistics of patch 1/2. Number of
> occurrences before patches are 1081 and not 3607 (percentage
> reduction of 26.7% remains correct)
>
> * Rename the functions as follows:
> - __variable_ffs() -> variable___ffs()
> - __variable_ffz() -> variable_ffz()
>
> * Add Reviewed-by: Nick Desaulniers <ndesaulniers@...gle.com> tag in
> patch 1/2.
>
>
> Vincent Mailhol (2):
> x86/asm/bitops: ffs: use __builtin_ffs to evaluate constant
> expressions
> x86/asm/bitops: __ffs,ffz: use __builtin_ctzl to evaluate constant
> expressions
>
> arch/x86/include/asm/bitops.h | 64 +++++++++++++++++++++--------------
> 1 file changed, 38 insertions(+), 26 deletions(-)
>
> --
> 2.35.1
>
--
Thanks,
~Nick Desaulniers