[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190825061158.GC28002@bombadil.infradead.org>
Date: Sat, 24 Aug 2019 23:11:58 -0700
From: Matthew Wilcox <willy@...radead.org>
To: Denis Efremov <efremov@...ras.ru>
Cc: akpm@...ux-foundation.org, Akinobu Mita <akinobu.mita@...il.com>,
Jan Kara <jack@...e.cz>, linux-kernel@...r.kernel.org,
Matthew Wilcox <matthew@....cx>, dm-devel@...hat.com,
linux-fsdevel@...r.kernel.org, linux-media@...r.kernel.org,
Erdem Tumurov <erdemus@...il.com>,
Vladimir Shelekhov <vshel@....nsk.su>
Subject: Re: [PATCH v2] lib/memweight.c: open codes bitmap_weight()
On Sat, Aug 24, 2019 at 01:01:02PM +0300, Denis Efremov wrote:
> This patch open codes the bitmap_weight() call. The direct
> invocation of hweight_long() allows to remove the BUG_ON and
> excessive "longs to bits, bits to longs" conversion.
Honestly, that's not the problem with this function. Take a look
at https://danluu.com/assembly-intrinsics/ for a _benchmarked_
set of problems with popcnt.
> BUG_ON was required to check that bitmap_weight() will return
> a correct value, i.e. the computed weight will fit the int type
> of the return value.
What? No. Look at the _arguments_ of bitmap_weight():
static __always_inline int bitmap_weight(const unsigned long *src, unsigned int nbits)
> With this patch memweight() controls the
> computation directly with size_t type everywhere. Thus, the BUG_ON
> becomes unnecessary.
Why are you bothering? How are you allocating half a gigabyte of memory?
Why are you calling memweight() on half a gigabyte of memory?
> if (longs) {
> - BUG_ON(longs >= INT_MAX / BITS_PER_LONG);
> - ret += bitmap_weight((unsigned long *)bitmap,
> - longs * BITS_PER_LONG);
> + const unsigned long *bitmap_long =
> + (const unsigned long *)bitmap;
> +
> bytes -= longs * sizeof(long);
> - bitmap += longs * sizeof(long);
> + for (; longs > 0; longs--, bitmap_long++)
> + ret += hweight_long(*bitmap_long);
> + bitmap = (const unsigned char *)bitmap_long;
> }
If you really must change anything, I'd rather see this turned into a
loop:
while (longs) {
unsigned int nbits;
if (longs >= INT_MAX / BITS_PER_LONG)
nbits = INT_MAX + 1;
else
nbits = longs * BITS_PER_LONG;
ret += bitmap_weight((unsigned long *)bitmap, sz);
bytes -= nbits / 8;
bitmap += nbits / 8;
longs -= nbits / BITS_PER_LONG;
}
then we only have to use Dan Luu's optimisation in bitmap_weight()
and not in memweight() as well.
Also, why does the trailer do this:
for (; bytes > 0; bytes--, bitmap++)
ret += hweight8(*bitmap);
instead of calling hweight_long on *bitmap & mask?
Powered by blists - more mailing lists