Message-ID: <20240128050449.1332798-4-mailhol.vincent@wanadoo.fr>
Date: Sun, 28 Jan 2024 14:00:09 +0900
From: Vincent Mailhol <mailhol.vincent@...adoo.fr>
To: Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org
Cc: Yury Norov <yury.norov@...il.com>,
Nick Desaulniers <ndesaulniers@...gle.com>,
Douglas Anderson <dianders@...omium.org>,
Kees Cook <keescook@...omium.org>,
Petr Mladek <pmladek@...e.com>,
Randy Dunlap <rdunlap@...radead.org>,
Zhaoyang Huang <zhaoyang.huang@...soc.com>,
Geert Uytterhoeven <geert+renesas@...der.be>,
Marco Elver <elver@...gle.com>,
Brian Cain <bcain@...cinc.com>,
Geert Uytterhoeven <geert@...ux-m68k.org>,
Matthew Wilcox <willy@...radead.org>,
"Paul E . McKenney" <paulmck@...nel.org>,
linux-m68k@...ts.linux-m68k.org,
Vincent Mailhol <mailhol.vincent@...adoo.fr>
Subject: [PATCH v4 3/5] hexagon/bitops: force inlining of all bit-find functions

The inline keyword does not guarantee that the compiler will actually
inline a function. Whenever the goal is to effectively inline a
function, __always_inline should be preferred instead.
__always_inline is also needed for further optimizations which will
come up in a follow-up patch.
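
For context, here is a minimal sketch of the difference, assuming
kernel-style C; foo() and bar() are hypothetical helpers used only
for illustration, and the macro below mirrors (roughly) how the
kernel's compiler attribute headers define __always_inline:

  /* Roughly what the kernel headers provide: */
  #ifndef __always_inline
  #define __always_inline inline __attribute__((__always_inline__))
  #endif

  /*
   * 'inline' is only a hint: the compiler may still decide not to
   * inline foo() and keep an out-of-line copy instead.
   */
  static inline int foo(int x)
  {
          return x * 2;
  }

  /*
   * __always_inline forces bar() to be inlined into its callers,
   * regardless of the optimization level or inlining heuristics.
   */
  static __always_inline int bar(int x)
  {
          return x * 2;
  }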

Force inlining of all the bit-find functions which have a custom
hexagon assembly implementation, namely: __ffs(), ffs(), ffz(),
__fls() and fls().

On Linux v6.7 with defconfig and clang 17.0.6, this change does not
impact the final size, meaning that, overall, those functions were
already getting inlined by modern compilers:

$ size --format=GNU vmlinux.before vmlinux.after
   text    data    bss   total filename
4827900 1798340 364057 6990297 vmlinux.before
4827900 1798340 364057 6990297 vmlinux.after
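
As an additional sanity check, one can also confirm that no
out-of-line copy of those helpers was emitted in the binary, for
example with something along the lines of:

$ nm vmlinux | grep -w -e ffs -e __ffs -e ffz -e fls -e __fls

(an empty output meaning no leftover out-of-line copies).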
Reference: commit 8dd5032d9c54 ("x86/asm/bitops: Force inlining of test_and_set_bit and friends")
Link: https://git.kernel.org/torvalds/c/8dd5032d9c54
Signed-off-by: Vincent Mailhol <mailhol.vincent@...adoo.fr>
---
arch/hexagon/include/asm/bitops.h | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/hexagon/include/asm/bitops.h b/arch/hexagon/include/asm/bitops.h
index 160d8f37fa1a..e856d6dbfe16 100644
--- a/arch/hexagon/include/asm/bitops.h
+++ b/arch/hexagon/include/asm/bitops.h
@@ -200,7 +200,7 @@ arch_test_bit_acquire(unsigned long nr, const volatile unsigned long *addr)
*
* Undefined if no zero exists, so code should check against ~0UL first.
*/
-static inline long ffz(int x)
+static __always_inline long ffz(int x)
{
int r;
@@ -217,7 +217,7 @@ static inline long ffz(int x)
* This is defined the same way as ffs.
* Note fls(0) = 0, fls(1) = 1, fls(0x80000000) = 32.
*/
-static inline int fls(unsigned int x)
+static __always_inline int fls(unsigned int x)
{
int r;
@@ -238,7 +238,7 @@ static inline int fls(unsigned int x)
* the libc and compiler builtin ffs routines, therefore
* differs in spirit from the above ffz (man ffs).
*/
-static inline int ffs(int x)
+static __always_inline int ffs(int x)
{
int r;
@@ -260,7 +260,7 @@ static inline int ffs(int x)
* bits_per_long assumed to be 32
* numbering starts at 0 I think (instead of 1 like ffs)
*/
-static inline unsigned long __ffs(unsigned long word)
+static __always_inline unsigned long __ffs(unsigned long word)
{
int num;
@@ -278,7 +278,7 @@ static inline unsigned long __ffs(unsigned long word)
* Undefined if no set bit exists, so code should check against 0 first.
* bits_per_long assumed to be 32
*/
-static inline unsigned long __fls(unsigned long word)
+static __always_inline unsigned long __fls(unsigned long word)
{
int num;
--
2.43.0