Date: Sun, 5 May 2024 13:11:53 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Yury Norov' <yury.norov@...il.com>, Kuan-Wei Chiu <visitorckw@...il.com>
CC: "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
	"linux@...musvillemoes.dk" <linux@...musvillemoes.dk>,
	"n26122115@...ncku.edu.tw" <n26122115@...ncku.edu.tw>,
	"jserv@...s.ncku.edu.tw" <jserv@...s.ncku.edu.tw>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH v4 1/2] lib/test_bitops: Add benchmark test for fns()

From: Yury Norov
> Sent: 01 May 2024 17:30
> 
> On Wed, May 01, 2024 at 09:20:46PM +0800, Kuan-Wei Chiu wrote:
> > Introduce a benchmark test for the fns(). It measures the total time
> > taken by fns() to process 1,000,000 test data generated using
> > get_random_bytes() for each n in the range [0, BITS_PER_LONG).
> >
> > example:
> > test_bitops: fns:          5876762553 ns, 64000000 iterations
> 
> So... 5 seconds for a test sounds too much. I see the following patch
> improves it dramatically, but in general let's stay in a range of
> milliseconds. On other machines it may run much slower and trigger
> a stall watchdog.
> 
> > Signed-off-by: Kuan-Wei Chiu <visitorckw@...il.com>
> 
> Suggested-by: Yury Norov <yury.norov@...il.com>
> 
> > ---
> >
> > Changes in v4:
> > - Correct get_random_long() -> get_random_bytes() in the commit
> >   message.
> >
> >  lib/test_bitops.c | 22 ++++++++++++++++++++++
> >  1 file changed, 22 insertions(+)
> >
> > diff --git a/lib/test_bitops.c b/lib/test_bitops.c
> > index 3b7bcbee84db..ed939f124417 100644
> > --- a/lib/test_bitops.c
> > +++ b/lib/test_bitops.c
> > @@ -50,6 +50,26 @@ static unsigned long order_comb_long[][2] = {
> >  };
> >  #endif
> >
> > +static unsigned long buf[1000000];
> 
> Can you make it __init, or allocate with kmalloc_array(), so that 64M
> of memory will not last forever in the kernel?
> 
> > +static int __init test_fns(void)
> > +{
> > +	unsigned int i, n;
> > +	ktime_t time;
> > +
> > +	get_random_bytes(buf, sizeof(buf));
> > +	time = ktime_get();
> > +
> > +	for (n = 0; n < BITS_PER_LONG; n++)
> > +		for (i = 0; i < 1000000; i++)
> > +			fns(buf[i], n);
> 
> What concerns me here is that fns() is in fact a const function, and
> the whole loop may be eliminated. Can you make sure it's not your case
> because 450x performance boost sounds a bit too much to me.
> 
> You can declare a "static volatile __used __init" variable to assign
> the result of fns(), and ensure that the code is not eliminated.

Yep, without 'c' this compiles to 'return 0'.

static inline unsigned long fns(unsigned long word, unsigned int n)
{
	while (word && n--)
		word &= word - 1;
	/* __builtin_ctzl() takes a long and is 0-based, like the kernel's __ffs() */
	return word ? __builtin_ctzl(word) : 8 * sizeof (long);
}

unsigned long buf[1000000];

volatile int c;

int test_fns(void)
{
	unsigned int i, n;

	for (n = 0; n < 8*sizeof (long); n++)
		for (i = 0; i < 1000000; i++)
			c = fns(buf[i], n);
	return 0;
}

You are also hitting the random number generator.
It would be better to use a predictable sequence.
Then you could, for instance, add up all the fns() results
and check you get the expected value.
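
A rough userspace sketch of that idea, not the kernel test itself. The
names fns_ref(), fill_predictable(), sum_fns() and check_fns_sum(), the
LCG used as the fixed sequence, and the buffer size NWORDS are all made
up for illustration. The expected sum is taken from a slow bit-by-bit
reference here; a real test could hard-code the value instead. Using the
sum also keeps the compiler from discarding the loop.

```c
#include <limits.h>
#include <stdint.h>

#define NBITS   (CHAR_BIT * sizeof(unsigned long))
#define NWORDS  1000u	/* hypothetical size, much smaller than the patch's buffer */

/* Userspace stand-in for fns(): index of the n-th set bit, or NBITS if none. */
static unsigned long fns(unsigned long word, unsigned int n)
{
	while (word && n--)
		word &= word - 1;	/* clear the lowest set bit */
	return word ? (unsigned long)__builtin_ctzl(word) : NBITS;
}

/* Slow reference: walk the bits one at a time. */
static unsigned long fns_ref(unsigned long word, unsigned int n)
{
	unsigned int bit;

	for (bit = 0; bit < NBITS; bit++) {
		if (!((word >> bit) & 1))
			continue;
		if (n-- == 0)
			return bit;
	}
	return NBITS;
}

/* Fill buf with a fixed sequence (64-bit LCG, constants from Knuth's MMIX). */
static void fill_predictable(unsigned long *buf, unsigned int len)
{
	uint64_t state = 1;
	unsigned int i;

	for (i = 0; i < len; i++) {
		state = state * 6364136223846793005ULL + 1442695040888963407ULL;
		buf[i] = (unsigned long)state;
	}
}

typedef unsigned long (*fns_fn)(unsigned long, unsigned int);

/* Add up f(buf[i], n) over every word and every n in [0, NBITS). */
static unsigned long sum_fns(fns_fn f, const unsigned long *buf, unsigned int len)
{
	unsigned long sum = 0;
	unsigned int i, n;

	for (n = 0; n < NBITS; n++)
		for (i = 0; i < len; i++)
			sum += f(buf[i], n);
	return sum;
}

/* Returns 0 when the fast version's sum matches the reference sum. */
int check_fns_sum(void)
{
	static unsigned long buf[NWORDS];

	fill_predictable(buf, NWORDS);
	return sum_fns(fns, buf, NWORDS) == sum_fns(fns_ref, buf, NWORDS) ? 0 : -1;
}
```

Only the sum_fns(fns, ...) call would need to sit between the two
timestamps; the reference pass is there purely for the check.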

With a really trivial 'RNG' (like stepping a CRC by one bit) you
could do it inside the loop and not need a buffer at all.
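
Something along these lines, again only a userspace sketch. crc_step()
and fns_sum_nobuf() are made-up names, and the polynomial (the reflected
CRC-64 one used by xz) and the seed of 1 are arbitrary picks for
illustration. The generator step is now inside the timed loop, but it is
only a shift and a conditional XOR.

```c
#include <limits.h>
#include <stdint.h>

#define NBITS (CHAR_BIT * sizeof(unsigned long))

/* Userspace stand-in for fns(): index of the n-th set bit, or NBITS if none. */
static unsigned long fns(unsigned long word, unsigned int n)
{
	while (word && n--)
		word &= word - 1;	/* clear the lowest set bit */
	return word ? (unsigned long)__builtin_ctzl(word) : NBITS;
}

/*
 * One bit-step of a reflected CRC register with no data fed in:
 * shift right, and XOR in the polynomial if the bit shifted out was set.
 */
static uint64_t crc_step(uint64_t state)
{
	return (state >> 1) ^ (-(state & 1) & 0xC96C5795D7870F42ULL);
}

/* Sum fns() over 'iters' generated words for each n; no buffer involved. */
unsigned long fns_sum_nobuf(unsigned int iters)
{
	uint64_t state = 1;	/* fixed non-zero seed: same words on every run */
	unsigned long sum = 0;
	unsigned int i, n;

	for (n = 0; n < NBITS; n++)
		for (i = 0; i < iters; i++) {
			state = crc_step(state);
			sum += fns((unsigned long)state, n);
		}
	return sum;
}
```

Because the sequence is fixed, the returned sum is the same on every run
for a given word size, so it can be compared against a known value, and
returning it keeps the calls from being optimised away.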

	David


