Message-ID: <848d1259-ff6e-4732-b840-a02a5e5fe2cb@acm.org>
Date: Fri, 29 Mar 2024 11:15:38 -0700
From: Bart Van Assche <bvanassche@....org>
To: I Hsin Cheng <richard120310@...il.com>, axboe@...nel.dk
Cc: akpm@...ux-foundation.org, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] blk-wbt: Speed up integer square root in rwb_arm_timer
On 3/29/24 2:12 AM, I Hsin Cheng wrote:
> As the results show, the original version of integer square root,
> "int_sqrt", takes 35.37 msec task-clock, 1,2181,3348 cycles, 1,6095,3665
> instructions, and 2551,2990 branches, and causes 1,0616 branch-misses.
>
> In comparison, the variant version of integer square root,
> "int_fastsqrt", takes 33.96 msec task-clock, 1,1645,7487 cycles,
> 5621,0086 instructions, and 321,0409 branches, and causes 2407 branch-misses.
> We can clearly see that "int_fastsqrt" runs faster and gives better results,
> so it is indeed a faster variant of integer square root.
I'm not sure that a 4% performance improvement is sufficient to justify
replacing the int_sqrt() implementation. Additionally, why add a second
implementation of int_sqrt() instead of replacing the int_sqrt()
implementation in lib/math/int_sqrt.c?
> The experiments were run on an x86_64 GNU/Linux system and the CPU is an
> Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz.
Since int_sqrt() does not use divisions and since int_fastsqrt() uses
divisions, can all CPUs supported by the Linux kernel divide numbers as
quickly as the CPU mentioned above?
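To make the division concern concrete, here is a minimal userspace sketch
contrasting the two styles. The body of int_fastsqrt() is not quoted in this
message, so the Newton iteration below is only an assumption about what a
division-based variant might look like, not the code from the patch. The
names isqrt_shift(), isqrt_newton() and check_agree() are hypothetical, and
isqrt_shift() merely follows the same shift-and-subtract idea as
lib/math/int_sqrt.c rather than copying it.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Shift-and-subtract integer square root. Uses only shifts, additions,
 * subtractions and comparisons, so its cost does not depend on how fast
 * the CPU can divide.
 */
static uint64_t isqrt_shift(uint64_t x)
{
	uint64_t result = 0;
	uint64_t bit = (uint64_t)1 << 62;	/* largest power of four in 64 bits */

	/* Find the largest power of four that does not exceed x. */
	while (bit > x)
		bit >>= 2;

	while (bit != 0) {
		if (x >= result + bit) {
			x -= result + bit;
			result = (result >> 1) + bit;
		} else {
			result >>= 1;
		}
		bit >>= 2;
	}
	return result;
}

/*
 * Hypothetical division-based variant (Newton iteration). Every loop
 * iteration performs one 64-bit division, which is cheap on some CPUs
 * and slow (or emulated in software) on others.
 */
static uint64_t isqrt_newton(uint64_t x)
{
	uint64_t r, next;

	if (x < 2)
		return x;

	r = x / 2 + 1;			/* initial guess, never below sqrt(x) */
	next = (r + x / r) / 2;
	while (next < r) {		/* guesses decrease until floor(sqrt(x)) */
		r = next;
		next = (r + x / r) / 2;
	}
	return r;
}

/* Check that both variants return the same value for every x in [0, limit]. */
static void check_agree(uint64_t limit)
{
	for (uint64_t x = 0; x <= limit; x++)
		assert(isqrt_shift(x) == isqrt_newton(x));
}
```

Calling check_agree() over a range of inputs confirms that the two functions
compute the same value, so any timing difference between them comes down to
the cost of the division instruction on the CPU under test.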
Thanks,
Bart.