Message-ID: <20260109195946.551a2693@mordecai>
Date: Fri, 9 Jan 2026 19:59:46 +0100
From: Petr Tesarik <ptesarik@...e.com>
To: Kuan-Wei Chiu <visitorckw@...il.com>
Cc: Yury Norov <yury.norov@...il.com>, Rasmus Villemoes
<linux@...musvillemoes.dk>, Richard Henderson
<richard.henderson@...aro.org>, Matt Turner <mattst88@...il.com>, Magnus
Lindholm <linmag7@...il.com>, Vineet Gupta <vgupta@...nel.org>, Geert
Uytterhoeven <geert@...ux-m68k.org>, "Maciej W. Rozycki"
<macro@...am.me.uk>, Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
Madhavan Srinivasan <maddy@...ux.ibm.com>, Michael Ellerman
<mpe@...erman.id.au>, Heiko Carstens <hca@...ux.ibm.com>, Vasily Gorbik
<gor@...ux.ibm.com>, Alexander Gordeev <agordeev@...ux.ibm.com>, Chris
Zankel <chris@...kel.net>, Max Filippov <jcmvbkbc@...il.com>, Patrik
Jakobsson <patrik.r.jakobsson@...il.com>, Maarten Lankhorst
<maarten.lankhorst@...ux.intel.com>, Maxime Ripard <mripard@...nel.org>,
Thomas Zimmermann <tzimmermann@...e.de>, David Airlie <airlied@...il.com>,
Simona Vetter <simona@...ll.ch>, Robin Murphy <robin.murphy@....com>, Joerg
Roedel <joro@...tes.org>, Will Deacon <will@...nel.org>, Jakub Kicinski
<kuba@...nel.org>, Andrew Lunn <andrew+netdev@...n.ch>, "David S. Miller"
<davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, Paolo Abeni
<pabeni@...hat.com>, Oliver Neukum <oliver@...kum.org>, Arnd Bergmann
<arnd@...db.de>, Andrew Morton <akpm@...ux-foundation.org>, Marcel Holtmann
<marcel@...tmann.org>, Johan Hedberg <johan.hedberg@...il.com>, Luiz
Augusto von Dentz <luiz.dentz@...il.com>, Pablo Neira Ayuso
<pablo@...filter.org>, Florian Westphal <fw@...len.de>,
linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 0/2] Helper to isolate least-significant bit
On Sat, 10 Jan 2026 01:20:59 +0800
Kuan-Wei Chiu <visitorckw@...il.com> wrote:
> Hi Petr,
>
> On Fri, Jan 09, 2026 at 05:41:34PM +0100, Petr Tesarik wrote:
> > Isolation of the least significant bit can be achieved with 3 basic
> > ALU operations which are already open-coded in various places in the
> > kernel.
> >
> > However, since other places less efficient constructs, for example
> > `1UL << ffs(x)`, I assume the trick is known only to some authors, and
> > it's worth adding a helper to promote its use.
>
> Just out of curiosity, are there any existing users employing 1 <<
> ffs(x) (or other inefficient variants) in performance-critical
> hotpaths?
>
> From a quick grep, I only found one instance in drivers/clk/ti/mux.c
> matching the 1 << ffs(x) pattern. However, this doesn't appear to be a
> bottleneck since it is followed by ti_clk_ll_ops->clk_writel(...). The
> latency of the MMIO write would likely overshadow the savings of a few
> ALU cycles.

Most expressions are a bit more complex, like this one in
page_cache_async_ra():

	align = 1UL << min(ra->order, ffs(max_pages) - 1);

Or split across multiple lines, like this one in sata_down_spd_limit():

	bit = ffs(mask) - 1;
	mask = 1 << bit;
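
To illustrate why such sites take more than a mechanical substitution, here is a minimal userspace sketch of the first expression and one possible ffs()-free form of it. It is not code from the series: lsb_val(), align_with_ffs() and align_with_lsb() are hypothetical names, POSIX ffs() stands in for the kernel's ffs(), and the sketch assumes max_pages is non-zero and order is smaller than the width of unsigned long. It relies on 1 << min(a, b) being equal to min(1 << a, 1 << b).

```c
#include <strings.h>	/* POSIX ffs(), standing in for the kernel's ffs() */

/* Hypothetical stand-in for the proposed helper: keep only the lowest set bit. */
static inline unsigned long lsb_val(unsigned long x)
{
	return x & -x;
}

/* Shape of the page_cache_async_ra() expression; max_pages must be non-zero. */
static inline unsigned long align_with_ffs(unsigned int order, unsigned int max_pages)
{
	int bit = ffs((int)max_pages) - 1;

	return 1UL << ((int)order < bit ? (int)order : bit);
}

/* Same value without ffs(): take the smaller of the two powers of two. */
static inline unsigned long align_with_lsb(unsigned int order, unsigned int max_pages)
{
	unsigned long by_order = 1UL << order;
	unsigned long by_pages = lsb_val(max_pages);

	return by_order < by_pages ? by_order : by_pages;
}
```

The two-line sata_down_spd_limit() pattern would reduce to something like mask = lsb_val(mask) under the same non-zero assumption.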

I agree that there is most likely no measurable performance win. The
resulting machine code merely looks quite silly on architectures
without an instruction to do ffs() and a little bit silly on
architectures where the instruction has slightly different semantics
(bit position numbering and/or handling of a zero value).
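
The semantic differences mentioned above (1-based bit numbering and the zero case) can be seen in a small userspace sketch. This is an illustration only, with hypothetical function names and POSIX ffs() standing in for the kernel's ffs(): the ffs()-based form has to subtract one and guard against zero, whereas the masking form yields 0 for a zero input without any special case.

```c
#include <strings.h>	/* POSIX ffs(): 1-based bit index, 0 if no bit is set */

/* Masking form: well defined for every input, returns 0 for x == 0. */
static inline unsigned int lsb_by_mask(unsigned int x)
{
	return x & -x;
}

/*
 * ffs()-based form: ffs() numbers bits from 1, so subtract one, and
 * guard the zero case because shifting by -1 is undefined behaviour.
 */
static inline unsigned int lsb_by_ffs(unsigned int x)
{
	return x ? 1U << (ffs((int)x) - 1) : 0;
}
```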
> Additionally, it seems that patch #2 focuses on replacing the x & -x
> implementation with the new API, rather than converting inefficient
> constructs like 1 << ffs(x) to use ffs_val().

This is correct. This patch series merely introduces the helper. Patch
2/2 was created mechanically. I haven't sent any follow-up yet to
replace the inefficient uses, because that would require more effort,
and I'm trying to get some early feedback first (that's why the series
is tagged RFC).

If we agree that it's worth the effort, and I split the patches by
subsystem, it may become a rather long series.

Petr T