[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20220624121313.2382500-9-alexandr.lobakin@intel.com>
Date: Fri, 24 Jun 2022 14:13:12 +0200
From: Alexander Lobakin <alexandr.lobakin@...el.com>
To: Arnd Bergmann <arnd@...db.de>, Yury Norov <yury.norov@...il.com>
Cc: Alexander Lobakin <alexandr.lobakin@...el.com>,
Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
Mark Rutland <mark.rutland@....com>,
Matt Turner <mattst88@...il.com>,
Brian Cain <bcain@...cinc.com>,
Geert Uytterhoeven <geert@...ux-m68k.org>,
Yoshinori Sato <ysato@...rs.sourceforge.jp>,
Rich Felker <dalias@...c.org>,
"David S. Miller" <davem@...emloft.net>,
Kees Cook <keescook@...omium.org>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>,
Marco Elver <elver@...gle.com>, Borislav Petkov <bp@...e.de>,
Tony Luck <tony.luck@...el.com>,
Maciej Fijalkowski <maciej.fijalkowski@...el.com>,
Jesse Brandeburg <jesse.brandeburg@...el.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Nathan Chancellor <nathan@...nel.org>,
Nick Desaulniers <ndesaulniers@...gle.com>,
Tom Rix <trix@...hat.com>, kernel test robot <lkp@...el.com>,
linux-alpha@...r.kernel.org, linux-hexagon@...r.kernel.org,
linux-ia64@...r.kernel.org, linux-m68k@...ts.linux-m68k.org,
linux-sh@...r.kernel.org, sparclinux@...r.kernel.org,
linux-arch@...r.kernel.org, llvm@...ts.linux.dev,
linux-kernel@...r.kernel.org
Subject: [PATCH v5 8/9] bitmap: don't assume compiler evaluates small mem*() builtins calls
Intel kernel bot triggered the build bug on ARC architecture that
in fact is as follows:
DECLARE_BITMAP(bitmap, BITS_PER_LONG);
bitmap_clear(bitmap, 0, BITS_PER_LONG);
BUILD_BUG_ON(!__builtin_constant_p(*bitmap));
which can be expanded to:
unsigned long bitmap[1];
memset(bitmap, 0, sizeof(*bitmap));
BUILD_BUG_ON(!__builtin_constant_p(*bitmap));
In most cases, a compiler is able to expand small/simple mem*()
calls to simple assignments or bitops, in this case that would mean:
unsigned long bitmap[1] = { 0 };
BUILD_BUG_ON(!__builtin_constant_p(*bitmap));
and on most architectures this works, but not on ARC, despite having
-O3 for every build.
So, to make this work, in case when the last bit to modify is still
within the first long (small_const_nbits()), just use plain
assignments for the rest of bitmap_*() functions which still use
mem*(), but didn't receive such compile-time optimizations yet.
This doesn't have the same coverage as compilers provide, but at
least something to start:
text: add/remove: 3/7 grow/shrink: 43/78 up/down: 1848/-3370 (-1546)
data: add/remove: 1/11 grow/shrink: 0/8 up/down: 4/-356 (-352)
notably cpumask_*() family when NR_CPUS <= BITS_PER_LONG:
netif_get_num_default_rss_queues 38 4 -34
cpumask_copy 90 - -90
cpumask_clear 146 - -146
and the abovementioned assertion started passing.
Signed-off-by: Alexander Lobakin <alexandr.lobakin@...el.com>
---
include/linux/bitmap.h | 22 +++++++++++++++++++---
1 file changed, 19 insertions(+), 3 deletions(-)
diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 2e6cd5681040..a0f4f3af8d30 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -238,20 +238,32 @@ extern int bitmap_print_list_to_buf(char *buf, const unsigned long *maskp,
static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
{
unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
- memset(dst, 0, len);
+
+ if (small_const_nbits(nbits))
+ *dst = 0;
+ else
+ memset(dst, 0, len);
}
static inline void bitmap_fill(unsigned long *dst, unsigned int nbits)
{
unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
- memset(dst, 0xff, len);
+
+ if (small_const_nbits(nbits))
+ *dst = ~0UL;
+ else
+ memset(dst, 0xff, len);
}
static inline void bitmap_copy(unsigned long *dst, const unsigned long *src,
unsigned int nbits)
{
unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
- memcpy(dst, src, len);
+
+ if (small_const_nbits(nbits))
+ *dst = *src;
+ else
+ memcpy(dst, src, len);
}
/*
@@ -431,6 +443,8 @@ static __always_inline void bitmap_set(unsigned long *map, unsigned int start,
{
if (__builtin_constant_p(nbits) && nbits == 1)
__set_bit(start, map);
+ else if (small_const_nbits(start + nbits))
+ *map |= GENMASK(start + nbits - 1, start);
else if (__builtin_constant_p(start & BITMAP_MEM_MASK) &&
IS_ALIGNED(start, BITMAP_MEM_ALIGNMENT) &&
__builtin_constant_p(nbits & BITMAP_MEM_MASK) &&
@@ -445,6 +459,8 @@ static __always_inline void bitmap_clear(unsigned long *map, unsigned int start,
{
if (__builtin_constant_p(nbits) && nbits == 1)
__clear_bit(start, map);
+ else if (small_const_nbits(start + nbits))
+ *map &= ~GENMASK(start + nbits - 1, start);
else if (__builtin_constant_p(start & BITMAP_MEM_MASK) &&
IS_ALIGNED(start, BITMAP_MEM_ALIGNMENT) &&
__builtin_constant_p(nbits & BITMAP_MEM_MASK) &&
--
2.36.1
Powered by blists - more mailing lists