Message-ID: <CAH5jb=bptDp43mLJ7A3nZuBnDB=V_wjLa3XqCSfsG8sC0OoFyg@mail.gmail.com>
Date: Sat, 30 Mar 2024 16:45:13 +0800
From: I Hsin Cheng <richard120310@...il.com>
To: Jens Axboe <axboe@...nel.dk>
Cc: Bart Van Assche <bvanassche@....org>, akpm@...ux-foundation.org, 
	linux-block@...r.kernel.org, linux-kernel@...r.kernel.org, 
	"Ching-Chun (Jim) Huang" <jserv@...s.ncku.edu.tw>
Subject: Re: [PATCH] blk-wbt: Speed up integer square root in rwb_arm_timer

My previous email was not in plain-text format, so I am resending it.
Sorry for the noise.

>> Additionally, why to add a second
>> implementation of int_sqrt() instead of replacing the int_sqrt()
>> implementation in lib/math/int_sqrt.c?

I was thinking of adding an alternative first, rather than replacing
int_sqrt() outright, since it is used in many other parts of the Linux
kernel.

>> Since int_sqrt() does not use divisions and since int_fastsqrt() uses
>> divisions, can all CPUs supported by the Linux kernel divide numbers as
>> quickly as the CPU mentioned above?

You're right about that. Thanks for pointing out the problem. I'll try to
replace the divisions, perhaps with another approximation method.
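
For reference, an integer square root can be computed with shifts,
additions, and comparisons only. The following is a minimal userspace
sketch of the classic shift-and-subtract technique, which belongs to the
same general family as the division-free int_sqrt() discussed above. The
name isqrt_shift_sub and the use of uint64_t are assumptions made for
illustration; this is not code from the patch or from lib/math/int_sqrt.c.

```c
#include <stdint.h>

/*
 * Hypothetical sketch: floor(sqrt(x)) without any division.
 * m walks down through the powers of four; y accumulates the root.
 */
static uint64_t isqrt_shift_sub(uint64_t x)
{
	uint64_t m, b, y = 0;

	/* Start at the largest power of four that does not exceed x. */
	m = (uint64_t)1 << 62;
	while (m > x)
		m >>= 2;

	while (m != 0) {
		b = y + m;	/* trial value for the current bit */
		y >>= 1;
		if (x >= b) {
			x -= b;	/* the bit belongs to the root */
			y += m;
		}
		m >>= 2;
	}
	return y;
}
```

With the value from the example quoted below, isqrt_shift_sub(1005117225)
yields 31703, the integer part of 31703.58.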

> The claim that it is floor(sqrt(val)) is not true.
> Trivial example:
>
> 1005117225
>         sqrt()          31703.58
>         int_sqrt()      30703
>         int_fastsqrt()  30821

Thanks for pointing out the problem. I only compared my method with
int_sqrt() and plotted the results using gnuplot. The plot showed that the
two gave very close answers, but I did not measure the error against the
integer part of sqrt(), which is indeed necessary. Sorry about that. I'll
check the precision of my method again.
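
One way such a precision check could be done without floating point is to
test the defining property of floor(sqrt(val)) directly: r is the answer
exactly when r*r <= val < (r+1)*(r+1). The sketch below assumes a
userspace test harness; is_floor_sqrt, count_mismatches, and
pow2_estimate are hypothetical names, and pow2_estimate is only a
deliberately rough stand-in for a candidate under test, not int_fastsqrt.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t (*isqrt_fn)(uint64_t);

/* True iff r == floor(sqrt(val)), using integer arithmetic only. */
static bool is_floor_sqrt(uint64_t val, uint64_t r)
{
	if (r > UINT32_MAX)
		return false;	/* r * r would not fit in 64 bits */
	if (r * r > val)
		return false;
	/* (r + 1)^2 only overflows for r == UINT32_MAX, where it exceeds any val. */
	return r == UINT32_MAX || (r + 1) * (r + 1) > val;
}

/* Count inputs in [lo, hi) for which fn() is not the exact floor root. */
static uint64_t count_mismatches(isqrt_fn fn, uint64_t lo, uint64_t hi)
{
	uint64_t bad = 0;

	for (uint64_t v = lo; v < hi; v++)
		if (!is_floor_sqrt(v, fn(v)))
			bad++;
	return bad;
}

/* Rough stand-in candidate: 2^(floor(log2(val)) / 2). Illustration only. */
static uint64_t pow2_estimate(uint64_t val)
{
	unsigned int bits = 0;

	if (val == 0)
		return 0;
	while ((val >>= 1) != 0)
		bits++;
	return (uint64_t)1 << (bits / 2);
}
```

For the value in the quoted example, is_floor_sqrt(1005117225, 31703)
holds while is_floor_sqrt(1005117225, 30821) does not. Passing a candidate
function to count_mismatches() over a range gives an error count that a
plot of two near-identical curves can hide.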

Thanks for your patience and for taking the time to review my patch.


Best Regards,

I Hsin Cheng.


On Sat, Mar 30, 2024 at 4:29 PM 鄭以新 <richard120310@...il.com> wrote:
>
> Jens Axboe <axboe@...nel.dk> wrote on Sat, Mar 30, 2024 at 3:12 AM:
>>
>> On 3/29/24 12:15 PM, Bart Van Assche wrote:
>> > On 3/29/24 2:12 AM, I Hsin Cheng wrote:
>> >> As the results show, the original integer square root, "int_sqrt",
>> >> takes 35.37 msec task-clock, 121,813,348 cycles, 160,953,665
>> >> instructions, 25,512,990 branches, and causes 10,616 branch-misses.
>> >>
>> >> The variant, "int_fastsqrt", takes 33.96 msec task-clock,
>> >> 116,457,487 cycles, 56,210,086 instructions, 3,210,409 branches, and
>> >> causes 2,407 branch-misses. We can clearly see that "int_fastsqrt"
>> >> runs faster and gives better numbers, so it is indeed a faster
>> >> variant of integer square root.
>> >
>> > I'm not sure that a 4% performance improvement is sufficient to
>> > replace the int_sqrt() implementation. Additionally, why to add a
>> > second implementation of int_sqrt() instead of replacing the
>> > int_sqrt() implementation in lib/math/int_sqrt.c?
>>
>> That's the real question imho - if it provides the same numbers and is
>> faster, why have two?
>>
>> I ran a quick test because I was curious, and the precision is
>> definitely worse. The claim that it is floor(sqrt(val)) is not true.
>> Trivial example:
>>
>> 1005117225
>>         sqrt()          31703.58
>>         int_sqrt()      30703
>>         int_fastsqrt()  30821
>>
>> whether this matters, probably not, but then again it's hard to care
>> about a slow path sqrt calculation. I'd certainly err on the side of
>> precision for that.
>>
>> --
>> Jens Axboe
>>
