lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <ZOMmuZuhdjA6mdIG@smile.fi.intel.com>
Date:   Mon, 21 Aug 2023 11:56:25 +0300
From:   Andy Shevchenko <andriy.shevchenko@...ux.intel.com>
To:     Rasmus Villemoes <linux@...musvillemoes.dk>
Cc:     Yury Norov <yury.norov@...il.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] bitmap: optimize bitmap_remap()

On Mon, Aug 21, 2023 at 09:48:38AM +0200, Rasmus Villemoes wrote:
> On 19/08/2023 04.03, Yury Norov wrote:
> > On Thu, Aug 17, 2023 at 06:37:01PM +0300, Andy Shevchenko wrote:
> 
> >> But this gives +89 bytes on x86_64... :-(
> > 
> > Who cares if it gives a boost of performance for regular users?
> 
> "Regular users" never ever hit bitmap_remap, it's simply too esoteric.
> It has all of _two_ users in the whole tree, one in some gpio driver,
> another only reached via a system call that nobody ever uses, and if
> they do, it's most likely some one-time-per-process thing. It's about as
> far from a hot path that you can come.
> 
> If it wasn't for that xilinx user, those bitmap_remap and bitmap_onto
> etc. should be moved to be private to the NUMA code.
> 
> Anyway, I think those +89 was for Andy's own counterproposal. I haven't
> built Yury's patch, but from a quick look, it should not add that much,
> if anything - it adds a test, call, early return, but OTOH it helps the
> compiler to combine the two set_bit (since only the first argument
> differs), and loses the lock prefix.
> 
> As for that latter point, I don't think a separate patch is worth it,
> just a comment in the commit log - we're already doing a bitmap_zero()
> on dst which certainly doesn't use any atomic ops, and in general all
> the bitmap_* functions expect the caller to handle locking.
> 
> So I don't mind Yury's patch, but I highly doubt it matters at all. The
> comment mentions an example, do we even have that put in test code?

Btw, I have locally a patch that adds bitmap_scatter()/bitmap_gather() and
switches Xilins to use it. Because that is what GPIO may use, see the code
below for the reference (there are no test cases yet), Comments are welcome
as usual!

>From 6013cce33dc1e0381240b7955f50044b26dd7659 Mon Sep 17 00:00:00 2001
From: Andy Shevchenko <andriy.shevchenko@...ux.intel.com>
Date: Thu, 20 Oct 2022 00:05:15 +0300
Subject: [PATCH 1/1] lib/bitmap: Introduce bitmap_scatter() and
 bitmap_gather() helpers

These helpers are optimized versions of the bitmap_remap() where one
of the bitmaps (source or destination) is of sequential bits.

See more in the kernel documentation of the helpers.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@...ux.intel.com>
---
 include/linux/bitmap.h |  9 ++++++
 lib/bitmap.c           | 70 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 79 insertions(+)

diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 03644237e1ef..d3c9fefde77c 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -60,6 +60,8 @@ struct device;
  *  bitmap_shift_left(dst, src, n, nbits)       *dst = *src << n
  *  bitmap_cut(dst, src, first, n, nbits)       Cut n bits from first, copy rest
  *  bitmap_replace(dst, old, new, mask, nbits)  *dst = (*old & ~(*mask)) | (*new & *mask)
+ *  bitmap_scatter(dst, src, mask, nbits)	*dst = map(dense, sparse)(src)
+ *  bitmap_gather(dst, src, mask, nbits)	*dst = map(sparse, dense)(src)
  *  bitmap_remap(dst, src, old, new, nbits)     *dst = map(old, new)(src)
  *  bitmap_bitremap(oldbit, old, new, nbits)    newbit = map(old, new)(oldbit)
  *  bitmap_onto(dst, orig, relmap, nbits)       *dst = orig relative to relmap
@@ -208,6 +210,12 @@ int bitmap_parselist(const char *buf, unsigned long *maskp,
 			int nmaskbits);
 int bitmap_parselist_user(const char __user *ubuf, unsigned int ulen,
 			unsigned long *dst, int nbits);
+
+unsigned int bitmap_scatter(unsigned long *dst, const unsigned long *src,
+		const unsigned long *mask, unsigned int nbits);
+unsigned int bitmap_gather(unsigned long *dst, const unsigned long *src,
+		const unsigned long *mask, unsigned int nbits);
+
 void bitmap_remap(unsigned long *dst, const unsigned long *src,
 		const unsigned long *old, const unsigned long *new, unsigned int nbits);
 int bitmap_bitremap(int oldbit,
@@ -216,6 +224,7 @@ void bitmap_onto(unsigned long *dst, const unsigned long *orig,
 		const unsigned long *relmap, unsigned int bits);
 void bitmap_fold(unsigned long *dst, const unsigned long *orig,
 		unsigned int sz, unsigned int nbits);
+
 int bitmap_find_free_region(unsigned long *bitmap, unsigned int bits, int order);
 void bitmap_release_region(unsigned long *bitmap, unsigned int pos, int order);
 int bitmap_allocate_region(unsigned long *bitmap, unsigned int pos, int order);
diff --git a/lib/bitmap.c b/lib/bitmap.c
index 24284caadbcc..dd27f0a5d1ce 100644
--- a/lib/bitmap.c
+++ b/lib/bitmap.c
@@ -942,6 +942,76 @@ int bitmap_parse(const char *start, unsigned int buflen,
 }
 EXPORT_SYMBOL(bitmap_parse);
 
+/**
+ * bitmap_scatter - Scatter a bitmap according to the given mask
+ * @dst: scattered bitmap
+ * @src: gathered bitmap
+ * @mask: bits to assign to in the scattered bitmap
+ * @nbits: number of bits in each of these bitmaps
+ *
+ * Scatters bitmap with sequential bits according to the given @mask.
+ *
+ * Example:
+ * If @src bitmap = 0x005a, with @mask = 0x1313, @dst will be 0x0302.
+ *
+ * Or in binary form
+ * @src			@mask			@dst
+ * 0000000001011010	0001001100010011	0000001100000010
+ *
+ * (Bits 0, 1, 2, 3, 4, 5 are copied to the bits 0, 1, 4, 8, 9, 12)
+ *
+ * Returns: the weight of the @mask.
+ */
+unsigned int bitmap_scatter(unsigned long *dst, const unsigned long *src,
+			    const unsigned long *mask, unsigned int nbits)
+{
+	unsigned int bit;
+	int n = 0;
+
+	bitmap_zero(dst, nbits);
+
+	for_each_set_bit(bit, mask, nbits)
+		__assign_bit(bit, dst, test_bit(n++, src));
+
+	return n;
+}
+EXPORT_SYMBOL(bitmap_scatter);
+
+/**
+ * bitmap_gather - Gather a bitmap according to given mask
+ * @dst: gathered bitmap
+ * @src: scattered bitmap
+ * @mask: bits to extract from in the scattered bitmap
+ * @nbits: number of bits in each of these bitmaps
+ *
+ * Gathers bitmap with sparse bits according to the given @mask.
+ *
+ * Example:
+ * If @src bitmap = 0x0302, with @mask = 0x1313, @dst will be 0x001a.
+ *
+ * Or in binary form
+ * @src			@mask			@dst
+ * 0000001100000010	0001001100010011	0000000000011010
+ *
+ * (Bits 0, 1, 4, 8, 9, 12 are copied to the bits 0, 1, 2, 3, 4, 5)
+ *
+ * Returns: the weight of the @mask.
+ */
+unsigned int bitmap_gather(unsigned long *dst, const unsigned long *src,
+			   const unsigned long *mask, unsigned int nbits)
+{
+	unsigned int bit;
+	int n = 0;
+
+	bitmap_zero(dst, nbits);
+
+	for_each_set_bit(bit, mask, nbits)
+		__assign_bit(n++, dst, test_bit(bit, src));
+
+	return n;
+}
+EXPORT_SYMBOL(bitmap_gather);
+
 /**
  * bitmap_pos_to_ord - find ordinal of set bit at given position in bitmap
  *	@buf: pointer to a bitmap

-- 
With Best Regards,
Andy Shevchenko


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ