[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMZ6RqL-GBDW1f4=eE_N+jaBjSnD7A8kG_XsNjyvcp94M2Ok3g@mail.gmail.com>
Date: Mon, 23 May 2022 18:22:54 +0900
From: Vincent MAILHOL <mailhol.vincent@...adoo.fr>
To: Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
x86@...nel.org
Cc: Dave Hansen <dave.hansen@...ux.intel.com>,
"H . Peter Anvin" <hpa@...or.com>,
Nathan Chancellor <nathan@...nel.org>,
Tom Rix <trix@...hat.com>, linux-kernel@...r.kernel.org,
llvm@...ts.linux.dev, David Howells <dhowells@...hat.com>,
Jan Beulich <JBeulich@...e.com>,
Christophe JAILLET <christophe.jaillet@...adoo.fr>,
Nick Desaulniers <ndesaulniers@...gle.com>
Subject: Re: [PATCH v4 0/2] x86/asm/bitops: optimize ff{s,z} functions for
constant expressions
On Thu. 12 May 2022 at 10:20, Vincent Mailhol
<mailhol.vincent@...adoo.fr> wrote:
> The compilers provide some builtin expression equivalent to the ffs(),
> __ffs() and ffz() function of the kernel. The kernel uses optimized
> assembly which produces better code than the builtin
> functions. However, such assembly code can not be optimized when used
> on constant expression.
>
> This series relies on __builtin_constant_p to select the optimal solution:
>
> * use kernel assembly for non constant expressions
>
> * use compiler's __builtin function for constant expressions.
>
> I also think that the fls() and fls64() can be optimized in a similar
> way, using __builtin_ctz() and __builtin_ctzll() but it is a bit less
> trivial so I want to focus on this series first. If it get accepted, I
> will then work on those two additionnal function.
>
>
> ** Statistics **
>
> Patch 1/2 optimizes 26.7% of ffs() calls and patch 2/2 optimizes 27.9%
> of __ffs() and ffz() calls (details of the calculation in each patch).
>
>
> ** Changelog **
>
> v3 -> v4:
>
> * (no changes on code, only commit comment was modified)
>
> * Remove note and link to Nick's message in patch 1/2, c.f.:
> https://lore.kernel.org/all/CAKwvOdnnDaiJcV1gr9vV+ya-jWxx7+2KJNTDThyFctVDOgt9zQ@mail.gmail.com/
>
> * Add Reviewed-by: Nick Desaulniers <ndesaulniers@...gle.com> in tag in patch 2/2.
>
>
> v2 -> v3:
>
> * Redacted out the instructions after ret and before next function
> in the assembly output.
>
> * Added a note and a link to Nick's message on the constant
> propagation missed-optimization in clang:
> https://lore.kernel.org/all/CAKwvOdnH_gYv4qRN9pKY7jNTQK95xNeH1w1KZJJmvCkh8xJLBg@mail.gmail.com/
>
> * Fix copy/paste typo in statistics of patch 1/2. Number of
> occurences before patches are 1081 and not 3607 (percentage
> reduction of 26.7% remains correct)
>
> * Rename the functions as follow:
> - __varible_ffs() -> variable___ffs()
> - __variable_ffz() -> variable_ffz()
>
> * Add Reviewed-by: Nick Desaulniers <ndesaulniers@...gle.com> in tag in patch 1/2.
>
> Vincent Mailhol (2):
> x86/asm/bitops: ffs: use __builtin_ffs to evaluate constant
> expressions
> x86/asm/bitops: __ffs,ffz: use __builtin_ctzl to evaluate constant
> expressions
Hi Thomas, Ingo and Borislav,
Are there any chances for you to pick those two patches during this
week's merge windows?
https://lore.kernel.org/all/20220512011855.1189653-2-mailhol.vincent@wanadoo.fr/
https://lore.kernel.org/all/20220512011855.1189653-3-mailhol.vincent@wanadoo.fr/
Thank you!
Yours sincerely,
Vincent Mailhol
Powered by blists - more mailing lists