lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.21.2601111836250.30566@angie.orcam.me.uk>
Date: Sun, 11 Jan 2026 21:21:57 +0000 (GMT)
From: "Maciej W. Rozycki" <macro@...am.me.uk>
To: Andrew Morton <akpm@...ux-foundation.org>, Jens Axboe <axboe@...nel.dk>, 
    John Garry <john.g.garry@...cle.com>, Su Hui <suhui@...china.com>, 
    "Martin K. Petersen" <martin.petersen@...cle.com>
cc: David Laight <david.laight.linux@...il.com>, linux-kernel@...r.kernel.org
Subject: [PATCH] linux/log2.h: Reduce instruction count for is_power_of_2()

Follow an observation that (n ^ (n - 1)) will only ever retain the most 
significant bit set in the word operated on if that is the only bit set 
in the first place, and use it to determine whether a number is a whole 
power of 2, avoiding the need for an explicit check for nonzero.

This reduces the sequence produced to 3 instructions only across Alpha, 
MIPS, and RISC-V targets, down from 4, 5, and 4 respectively, removing a 
branch in the two latter cases.  And it's 5 instructions on POWER and 
x86-64 vs 8 and 9 respectively.  There are no branches now emitted here 
for targets that have a suitable conditional set operation, although an 
inline expansion will often end with one, depending on what code a call 
to this function is used in.

Credit goes to GCC authors for coming up with this optimisation used as 
the fallback for (__builtin_popcountl(n) == 1), equivalent to this code, 
for targets where the hardware population count operation is considered 
expensive.

Signed-off-by: Maciej W. Rozycki <macro@...am.me.uk>
---
Hi,

 As discussed here[1] a further improvement might be possible with targets 
that support a population count operation in hardware, but care needs to 
be taken for a libcall not to be produced instead, so such code would have 
to be conditionalised on the presence of a population count instruction 
and its correct handling in the compiler (which GCC gets right for POWER, 
but wrong for Alpha, apparently owing to an incorrect cost taken for the 
operation).

 Anyway, this seems like worthwhile if small an improvement regardless, so 
please apply.

References:

[1] "treewide, bits: use ffs_val() where it is open-coded", 
    <https://lore.kernel.org/r/alpine.DEB.2.21.2601111450500.30566@angie.orcam.me.uk/>.

  Maciej
---
 include/linux/log2.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

linux-log2-pow2-opt.diff
Index: linux-macro/include/linux/log2.h
===================================================================
--- linux-macro.orig/include/linux/log2.h
+++ linux-macro/include/linux/log2.h
@@ -44,7 +44,7 @@ int __ilog2_u64(u64 n)
 static __always_inline __attribute__((const))
 bool is_power_of_2(unsigned long n)
 {
-	return (n != 0 && ((n & (n - 1)) == 0));
+	return n - 1 < (n ^ (n - 1));
 }
 
 /**

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ