[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZjJt9w2mvdm2P+dM@yury-ThinkPad>
Date: Wed, 1 May 2024 09:29:43 -0700
From: Yury Norov <yury.norov@...il.com>
To: Kuan-Wei Chiu <visitorckw@...il.com>
Cc: akpm@...ux-foundation.org, linux@...musvillemoes.dk,
n26122115@...ncku.edu.tw, jserv@...s.ncku.edu.tw,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 1/2] lib/test_bitops: Add benchmark test for fns()
On Wed, May 01, 2024 at 09:20:46PM +0800, Kuan-Wei Chiu wrote:
> Introduce a benchmark test for the fns(). It measures the total time
> taken by fns() to process 1,000,000 test data generated using
> get_random_bytes() for each n in the range [0, BITS_PER_LONG).
>
> example:
> test_bitops: fns: 5876762553 ns, 64000000 iterations
So... 5 seconds for a test sounds too much. I see the following patch
improves it dramatically, but in general let's stay in a range of
milliseconds. On other machines it may run much slower and trigger
a stall watchdog.
> Signed-off-by: Kuan-Wei Chiu <visitorckw@...il.com>
Suggested-by: Yury Norov <yury.norov@...il.com>
> ---
>
> Changes in v4:
> - Correct get_random_long() -> get_random_bytes() in the commit
> message.
>
> lib/test_bitops.c | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
> diff --git a/lib/test_bitops.c b/lib/test_bitops.c
> index 3b7bcbee84db..ed939f124417 100644
> --- a/lib/test_bitops.c
> +++ b/lib/test_bitops.c
> @@ -50,6 +50,26 @@ static unsigned long order_comb_long[][2] = {
> };
> #endif
>
> +static unsigned long buf[1000000];
Can you make it __init, or allocate with kmalloc_array(), so that 64M
of memory will not last forever in the kernel?
> +static int __init test_fns(void)
> +{
> + unsigned int i, n;
> + ktime_t time;
> +
> + get_random_bytes(buf, sizeof(buf));
> + time = ktime_get();
> +
> + for (n = 0; n < BITS_PER_LONG; n++)
> + for (i = 0; i < 1000000; i++)
> + fns(buf[i], n);
What concerns me here is that fns() is a in fact a const function, and
the whole loop may be eliminated. Can you make sure it's not your case
because 450x performance boost sounds a bit too much to me.
You can declare a "static volatile __used __init" variable to assign
the result of fns(), and ensure that the code is not eliminated
> + time = ktime_get() - time;
> + pr_err("fns: %18llu ns, %6d iterations\n", time, BITS_PER_LONG * 1000000);
Here the number of iterations is always the same. What's the point to
print it?
Thanks,
Yury
Powered by blists - more mailing lists