lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMZ6RqL-GBDW1f4=eE_N+jaBjSnD7A8kG_XsNjyvcp94M2Ok3g@mail.gmail.com>
Date:   Mon, 23 May 2022 18:22:54 +0900
From:   Vincent MAILHOL <mailhol.vincent@...adoo.fr>
To:     Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        x86@...nel.org
Cc:     Dave Hansen <dave.hansen@...ux.intel.com>,
        "H . Peter Anvin" <hpa@...or.com>,
        Nathan Chancellor <nathan@...nel.org>,
        Tom Rix <trix@...hat.com>, linux-kernel@...r.kernel.org,
        llvm@...ts.linux.dev, David Howells <dhowells@...hat.com>,
        Jan Beulich <JBeulich@...e.com>,
        Christophe JAILLET <christophe.jaillet@...adoo.fr>,
        Nick Desaulniers <ndesaulniers@...gle.com>
Subject: Re: [PATCH v4 0/2] x86/asm/bitops: optimize ff{s,z} functions for
 constant expressions

On Thu. 12 May 2022 at 10:20, Vincent Mailhol
<mailhol.vincent@...adoo.fr> wrote:
> The compilers provide some builtin expression equivalent to the ffs(),
> __ffs() and ffz() function of the kernel. The kernel uses optimized
> assembly which produces better code than the builtin
> functions. However, such assembly code can not be optimized when used
> on constant expression.
>
> This series relies on __builtin_constant_p to select the optimal solution:
>
>   * use kernel assembly for non constant expressions
>
>   * use compiler's __builtin function for constant expressions.
>
> I also think that the fls() and fls64() can be optimized in a similar
> way, using __builtin_ctz() and __builtin_ctzll() but it is a bit less
> trivial so I want to focus on this series first. If it get accepted, I
> will then work on those two additionnal function.
>
>
> ** Statistics **
>
> Patch 1/2 optimizes 26.7% of ffs() calls and patch 2/2 optimizes 27.9%
> of __ffs() and ffz() calls (details of the calculation in each patch).
>
>
> ** Changelog **
>
> v3 -> v4:
>
>   * (no changes on code, only commit comment was modified)
>
>   * Remove note and link to Nick's message in patch 1/2, c.f.:
>   https://lore.kernel.org/all/CAKwvOdnnDaiJcV1gr9vV+ya-jWxx7+2KJNTDThyFctVDOgt9zQ@mail.gmail.com/
>
>   * Add Reviewed-by: Nick Desaulniers <ndesaulniers@...gle.com> in tag in patch 2/2.
>
>
> v2 -> v3:
>
>   * Redacted out the instructions after ret and before next function
>     in the assembly output.
>
>   * Added a note and a link to Nick's message on the constant
>     propagation missed-optimization in clang:
>     https://lore.kernel.org/all/CAKwvOdnH_gYv4qRN9pKY7jNTQK95xNeH1w1KZJJmvCkh8xJLBg@mail.gmail.com/
>
>   * Fix copy/paste typo in statistics of patch 1/2. Number of
>     occurences before patches are 1081 and not 3607 (percentage
>     reduction of 26.7% remains correct)
>
>   * Rename the functions as follow:
>     - __varible_ffs() -> variable___ffs()
>     - __variable_ffz() -> variable_ffz()
>
>   * Add Reviewed-by: Nick Desaulniers <ndesaulniers@...gle.com> in tag in patch 1/2.
>
> Vincent Mailhol (2):
>   x86/asm/bitops: ffs: use __builtin_ffs to evaluate constant
>     expressions
>   x86/asm/bitops: __ffs,ffz: use __builtin_ctzl to evaluate constant
>     expressions

Hi Thomas, Ingo and Borislav,

Are there any chances for you to pick those two patches during this
week's merge windows?
https://lore.kernel.org/all/20220512011855.1189653-2-mailhol.vincent@wanadoo.fr/
https://lore.kernel.org/all/20220512011855.1189653-3-mailhol.vincent@wanadoo.fr/

Thank you!


Yours sincerely,
Vincent Mailhol

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ