Message-ID: <20260110105002.1067bf38@pumpkin>
Date: Sat, 10 Jan 2026 10:50:02 +0000
From: David Laight <david.laight.linux@...il.com>
To: Petr Tesarik <ptesarik@...e.com>
Cc: Yury Norov <yury.norov@...il.com>, Rasmus Villemoes
<linux@...musvillemoes.dk>, Richard Henderson
<richard.henderson@...aro.org>, Matt Turner <mattst88@...il.com>, Magnus
Lindholm <linmag7@...il.com>, Vineet Gupta <vgupta@...nel.org>, Geert
Uytterhoeven <geert@...ux-m68k.org>, "Maciej W. Rozycki"
<macro@...am.me.uk>, Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
Madhavan Srinivasan <maddy@...ux.ibm.com>, Michael Ellerman
<mpe@...erman.id.au>, Heiko Carstens <hca@...ux.ibm.com>, Vasily Gorbik
<gor@...ux.ibm.com>, Alexander Gordeev <agordeev@...ux.ibm.com>, Chris
Zankel <chris@...kel.net>, Max Filippov <jcmvbkbc@...il.com>, Patrik
Jakobsson <patrik.r.jakobsson@...il.com>, Maarten Lankhorst
<maarten.lankhorst@...ux.intel.com>, Maxime Ripard <mripard@...nel.org>,
Thomas Zimmermann <tzimmermann@...e.de>, David Airlie <airlied@...il.com>,
Simona Vetter <simona@...ll.ch>, Robin Murphy <robin.murphy@....com>, Joerg
Roedel <joro@...tes.org>, Will Deacon <will@...nel.org>, Jakub Kicinski
<kuba@...nel.org>, Andrew Lunn <andrew+netdev@...n.ch>, "David S. Miller"
<davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, Paolo Abeni
<pabeni@...hat.com>, Oliver Neukum <oliver@...kum.org>, Arnd Bergmann
<arnd@...db.de>, Kuan-Wei Chiu <visitorckw@...il.com>, Andrew Morton
<akpm@...ux-foundation.org>, Marcel Holtmann <marcel@...tmann.org>, Johan
Hedberg <johan.hedberg@...il.com>, Luiz Augusto von Dentz
<luiz.dentz@...il.com>, Pablo Neira Ayuso <pablo@...filter.org>, Florian
Westphal <fw@...len.de>, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 1/2] bits: introduce ffs_val()
On Fri, 9 Jan 2026 17:37:56 +0100
Petr Tesarik <ptesarik@...e.com> wrote:
> Introduce a macro that can efficiently extract the least significant
> non-zero bit from a value.
>
> Interestingly, this bit-twiddling trick is open-coded in some places, but
> it also appears to be little known, leading to various inefficient
> implementations in other places. Let's make it part of the standard bitops
> arsenal.
I'm not sure whether ffs_val(x) is actually more readable than an
open-coded (x & -x).
If you don't know what either means you have to look it up or work
it out.
The latter just requires a bit of thought, the former searching through
the source tree for the correct header and then believing the comment
or, again, working out what it does.
That said, I'm not objecting to adding it, but the churn of changing
existing code is probably not worth the effort.
I'd also define it as x & (~x + 1), which makes it a lot more obvious
why it is correct; the compiler will convert it to a signed negate.
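
A minimal sketch of the two spellings side by side, to show why they
agree. The helper names lsb_neg() and lsb_not_plus_one() are
hypothetical and are not the macro from the patch; fixed-width unsigned
arguments are assumed so that the negate is well defined.

```c
#include <stdint.h>

/* Hypothetical helper: the usual open-coded form. */
static inline uint32_t lsb_neg(uint32_t x)
{
	/* Unsigned negate is computed modulo 2^32, so this is well defined. */
	return x & -x;
}

/* Hypothetical helper: the same value spelled as x & (~x + 1). */
static inline uint32_t lsb_not_plus_one(uint32_t x)
{
	/*
	 * ~x has ones where x has trailing zeros and a zero at the lowest
	 * set bit of x. Adding 1 carries through those ones and sets that
	 * bit; every higher bit stays the complement of x, so the AND
	 * leaves only the lowest set bit.
	 */
	return x & (~x + 1);
}
```

Both helpers return 0 for x == 0, since there is no set bit to isolate.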
Also, as I pointed out earlier, many modern CPUs have an instruction
for ffs(). While x & -x is usually better than 1u << __ffs(x), the same
is not true for y * (x & -x) and y << __ffs(x).
In particular on Zen4/5 bsf (used for __ffs) has a latency of 1 but the
multiply has a latency of 3.
Mainstream Intel x86 CPUs all have a latency of 3 for both imul and bsf.
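
For illustration, the two ways of scaling y by the lowest set bit of x
might look like the sketch below. The function names are hypothetical,
and the GCC/Clang built-in __builtin_ctz() is used here as a userspace
stand-in for the kernel's __ffs(); like __ffs(), it must not be called
with x == 0.

```c
#include <stdint.h>

/* Hypothetical helper: isolate the lowest set bit, then multiply. */
static inline uint32_t scale_mul(uint32_t y, uint32_t x)
{
	return y * (x & -x);
}

/*
 * Hypothetical helper: find the bit index, then shift.
 * Assumption: __builtin_ctz() stands in for __ffs(); x must be non-zero.
 */
static inline uint32_t scale_shift(uint32_t y, uint32_t x)
{
	return y << __builtin_ctz(x);
}
```

For non-zero x both give the same result modulo 2^32; which one is
faster depends on the relative latency of the multiply and the
bit-scan instruction on the target CPU.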
There should be #define definitions of is_power_of_2_or_zero(x) as
!(x & (x - 1)) and of is_power_of_2(x) as
(x && is_power_of_2_or_zero(x)) in the same header.
But there is only an inline is_power_of_2(unsigned long) in log2.h.
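
A rough sketch of what such type-generic macros could look like, using
the usual x & (x - 1) test. This is an assumption about their shape,
not existing kernel code; note that the arguments are evaluated more
than once, so they must not have side effects, and unsigned operands
are assumed.

```c
/* True when at most one bit is set, i.e. x is zero or a power of two. */
#define is_power_of_2_or_zero(x)	(!((x) & ((x) - 1)))

/* True when exactly one bit is set. */
#define is_power_of_2(x)		((x) && is_power_of_2_or_zero(x))
```

Unlike an inline taking unsigned long, a macro of this form also works
unchanged on 64-bit values on 32-bit architectures.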
David