linux-kernel - Re: [PATCH v2] clocksource: Replace loop within clocks_calc_mult_shift() with clz-based calculation for sftacc

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CANDhNCrez0=fKoH=1Kg6SG08yPmZ3q3N7rD+Dm3e8=-hHoj_6w@mail.gmail.com>
Date: Tue, 10 Jun 2025 21:31:49 -0700
From: John Stultz <jstultz@...gle.com>
To: I Hsin Cheng <richard120310@...il.com>
Cc: tglx@...utronix.de, sboyd@...nel.org, linux-kernel@...r.kernel.org, 
	skhan@...uxfoundation.org, linux-kernel-mentees@...ts.linux.dev, 
	jserv@...s.ncku.edu.tw
Subject: Re: [PATCH v2] clocksource: Replace loop within clocks_calc_mult_shift()
 with clz-based calculation for sftacc

On Tue, Jun 10, 2025 at 8:30 PM I Hsin Cheng <richard120310@...il.com> wrote:
>
> v1 -> v2:
>         - Refine commit message to explain more about "why"
>         - Check the frequency of "clocks_calc_mult_shift()" get called,
>           it's not in hotpath on my machine, refine the commit message
>       to avoid overselling it
>         - Add comments for the code to explain the implementation in
>           more detail
>         - Handle case for "tmp == 0" to avoid undefined behavior
>     - Experiment using ffs() but it's used for finding the LSB of
>       a number, which isn't MSB as we want. It would still need
>       looping when intended to use ffs() in this scenario

Oh, apologies for mixing that up and leading you astray!


> diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
> index 2a7802ec480c..59145d762f1e 100644
> --- a/kernel/time/clocksource.c
> +++ b/kernel/time/clocksource.c
> @@ -66,10 +66,20 @@ clocks_calc_mult_shift(u32 *mult, u32 *shift, u32 from, u32 to, u32 maxsec)
>          * range:
>          */
>         tmp = ((u64)maxsec * from) >> 32;
> -       while (tmp) {
> -               tmp >>=1;
> -               sftacc--;
> -       }
> +
> +       /*
> +        * Decrement "sftacc" by the number of bits needed to represent "tmp".
> +        * Using (32 - __builtin_clz(tmp)) to ge the bit width:
> +        * - __builtin_clz(tmp) returns the count of leading zero bits
> +        * - (32 - __builtin_clz(tmp)) gives us the number of significant bits
> +        *
> +        * Note: It use 32 instead of 31 because we want bit width (0-index MSB
> +        * position + 1), not just the MSB position itself.
> +        *
> +        * Handle tmp == 0 separately since __builtin_clz(0) has undefined behavior.
> +        */
> +       if (tmp)
> +               sftacc -= (32 - __builtin_clz(tmp));

Still think the detail that __builtin_clz only works on the bottom
32bits is good to highlight in the comment.
Though maybe, would explicitly casting tmp to a u32 help it be more clear here?

thanks
-john