[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANDhNCrez0=fKoH=1Kg6SG08yPmZ3q3N7rD+Dm3e8=-hHoj_6w@mail.gmail.com>
Date: Tue, 10 Jun 2025 21:31:49 -0700
From: John Stultz <jstultz@...gle.com>
To: I Hsin Cheng <richard120310@...il.com>
Cc: tglx@...utronix.de, sboyd@...nel.org, linux-kernel@...r.kernel.org,
skhan@...uxfoundation.org, linux-kernel-mentees@...ts.linux.dev,
jserv@...s.ncku.edu.tw
Subject: Re: [PATCH v2] clocksource: Replace loop within clocks_calc_mult_shift()
with clz-based calculation for sftacc
On Tue, Jun 10, 2025 at 8:30 PM I Hsin Cheng <richard120310@...il.com> wrote:
>
> v1 -> v2:
> - Refine commit message to explain more about "why"
> - Check the frequency of "clocks_calc_mult_shift()" get called,
> it's not in hotpath on my machine, refine the commit message
> to avoid overselling it
> - Add comments for the code to explain the implementation in
> more detail
> - Handle case for "tmp == 0" to avoid undefined behavior
> - Experiment using ffs() but it's used for finding the LSB of
> a number, which isn't MSB as we want. It would still need
> looping when intended to use ffs() in this scenario
Oh, apologies for mixing that up and leading you astray!
> diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
> index 2a7802ec480c..59145d762f1e 100644
> --- a/kernel/time/clocksource.c
> +++ b/kernel/time/clocksource.c
> @@ -66,10 +66,20 @@ clocks_calc_mult_shift(u32 *mult, u32 *shift, u32 from, u32 to, u32 maxsec)
> * range:
> */
> tmp = ((u64)maxsec * from) >> 32;
> - while (tmp) {
> - tmp >>=1;
> - sftacc--;
> - }
> +
> + /*
> + * Decrement "sftacc" by the number of bits needed to represent "tmp".
> + * Using (32 - __builtin_clz(tmp)) to ge the bit width:
> + * - __builtin_clz(tmp) returns the count of leading zero bits
> + * - (32 - __builtin_clz(tmp)) gives us the number of significant bits
> + *
> + * Note: It use 32 instead of 31 because we want bit width (0-index MSB
> + * position + 1), not just the MSB position itself.
> + *
> + * Handle tmp == 0 separately since __builtin_clz(0) has undefined behavior.
> + */
> + if (tmp)
> + sftacc -= (32 - __builtin_clz(tmp));
Still think the detail that __builtin_clz only works on the bottom
32bits is good to highlight in the comment.
Though maybe, would explicitly casting tmp to a u32 help it be more clear here?
thanks
-john
Powered by blists - more mailing lists