Date: Sun, 24 Jul 2022 00:15:19 +0900
From: Vincent Mailhol <mailhol.vincent@...adoo.fr>
To: Nick Desaulniers <ndesaulniers@...gle.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	Borislav Petkov <bp@...en8.de>,
	x86@...nel.org,
	Peter Zijlstra <peterz@...radead.org>
Cc: Dave Hansen <dave.hansen@...ux.intel.com>,
	"H . Peter Anvin" <hpa@...or.com>,
	Nathan Chancellor <nathan@...nel.org>,
	Tom Rix <trix@...hat.com>,
	linux-kernel@...r.kernel.org,
	llvm@...ts.linux.dev,
	David Howells <dhowells@...hat.com>,
	Jan Beulich <JBeulich@...e.com>,
	Christophe Jaillet <christophe.jaillet@...adoo.fr>,
	Joe Perches <joe@...ches.com>,
	Josh Poimboeuf <jpoimboe@...nel.org>,
	Vincent Mailhol <mailhol.vincent@...adoo.fr>
Subject: [RESEND PATCH v4 0/2] x86/asm/bitops: optimize ff{s,z} functions for constant expressions

Compilers provide builtin functions equivalent to the kernel's ffs(),
__ffs() and ffz() functions. The kernel uses optimized assembly, which
produces better code than the builtin functions. However, the compiler
cannot optimize such assembly code when it is used on constant
expressions.

This series relies on __builtin_constant_p to select the optimal
solution:

  * use the kernel assembly for non-constant expressions

  * use the compiler's __builtin functions for constant expressions.


** Statistics **

Patch 1/2 optimizes 26.7% of ffs() calls and patch 2/2 optimizes 27.9%
of __ffs() and ffz() calls (details of the calculation in each patch).


** Changelog **

v3 -> v4:

  * (no changes to the code, only the commit comments were modified)

  * Remove the note and link to Nick's message in patch 1/2, cf.:
    https://lore.kernel.org/all/CAKwvOdnnDaiJcV1gr9vV+ya-jWxx7+2KJNTDThyFctVDOgt9zQ@mail.gmail.com/

  * Add Reviewed-by: Nick Desaulniers <ndesaulniers@...gle.com> tag in
    patch 2/2.


v2 -> v3:

  * Redacted out the instructions after ret and before the next
    function in the assembly output.

  * Added a note and a link to Nick's message on the constant
    propagation missed-optimization in clang:
    https://lore.kernel.org/all/CAKwvOdnH_gYv4qRN9pKY7jNTQK95xNeH1w1KZJJmvCkh8xJLBg@mail.gmail.com/

  * Fix copy/paste typo in the statistics of patch 1/2. The number of
    occurrences before the patches is 1081 and not 3607 (the percentage
    reduction of 26.7% remains correct).

  * Rename the functions as follows:
    - __variable_ffs() -> variable___ffs()
    - __variable_ffz() -> variable_ffz()

  * Add Reviewed-by: Nick Desaulniers <ndesaulniers@...gle.com> tag in
    patch 1/2.


Vincent Mailhol (2):
  x86/asm/bitops: ffs: use __builtin_ffs to evaluate constant expressions
  x86/asm/bitops: __ffs,ffz: use __builtin_ctzl to evaluate constant expressions

 arch/x86/include/asm/bitops.h | 64 +++++++++++++++++++++--------------
 1 file changed, 38 insertions(+), 26 deletions(-)

-- 
2.35.1
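
For readers who want to see the shape of the __builtin_constant_p
dispatch described in the cover letter, here is a minimal userspace
sketch. It is not the code from the patches: the names ffs_sketch() and
variable_ffs_sketch() are hypothetical, and a plain C loop stands in for
the kernel's hand-written assembly, which is not reproduced here. It
assumes GCC or clang, since both builtins are compiler extensions.

```c
/*
 * Hypothetical stand-in for the kernel's assembly implementation.
 * Same convention as ffs(): returns the 1-based position of the least
 * significant set bit, or 0 when no bit is set.
 */
static inline int variable_ffs_sketch(int x)
{
	unsigned int v = (unsigned int)x;
	int pos = 1;

	if (!v)
		return 0;

	/* Shift right until the lowest set bit reaches bit 0. */
	while (!(v & 1u)) {
		v >>= 1;
		pos++;
	}
	return pos;
}

/*
 * Constant expressions go to the compiler builtin, which can be folded
 * at compile time; everything else goes to the "variable" path.
 * __builtin_constant_p() does not evaluate its argument, and only one
 * arm of the conditional runs, so x is evaluated exactly once.
 */
#define ffs_sketch(x)					\
	(__builtin_constant_p(x) ?			\
	 __builtin_ffs(x) :				\
	 variable_ffs_sketch(x))
```

With this macro, a call such as ffs_sketch(0x10) can be reduced to a
constant by the compiler, while a call on a runtime value goes through
variable_ffs_sketch(). Both paths are expected to return the same
result for the same input.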