linux-kernel - Re: [RFC PATCH] clocksource: Enhancement for clocks_calc_mult

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <aEfQauoqhEs2DkiG@vaxr-BM6660-BM6360>
Date: Tue, 10 Jun 2025 14:27:54 +0800
From: I Hsin Cheng <richard120310@...il.com>
To: John Stultz <jstultz@...gle.com>
Cc: sboyd@...nel.org, tglx@...utronix.de, linux-kernel@...r.kernel.org,
	skhan@...uxfoundation.org, linux-kernel-mentees@...ts.linux.dev,
	jserv@...s.ncku.edu.tw
Subject: Re: [RFC PATCH] clocksource: Enhancement for clocks_calc_mult_shift()

On Mon, Jun 09, 2025 at 05:32:05PM -0700, John Stultz wrote:
> Thanks so much for sending this optimization in! It looks promising,
> but a few comments below:
> 
> On Mon, Jun 9, 2025 at 12:46 PM I Hsin Cheng <richard120310@...il.com> wrote:
> >
> > Previously, counting the value of "sftacc" within
> > "clocks_calc_mult_shift()" use a while loop to determine the amount of
> > decrement operation for "sftacc".
> 
> This is just a direct translation of the code into english, without
> any context or explanation as to *why* the code is doing what it is
> doing.
> 
> It might be better to explain first *why* the calculation is being
> done, and then get into the limitations of the current implementation.
>

Sure thing, I'll change this part.

> > It's equivalent to the position of MSB of "tmp", for which we can
> > derive from (32 - __builtin_clz(tmp)). Use 32 instead of 31 here because
> > 1UL should be treat as an amount of 1, not the index 0, and even though
> > "tmp" is of type u64, since it's already shifted right by 32, only the
> > lowest 32 bits will be possible to have non-zero value.
> 
> I was having a bit of a hard time parsing this paragraph, so it
> probably needs some more work.
> 
> On first skim, this also was a bit confusing as __builtin_clz(u64)
> *seems* like it could return greater than 32 (but that would be
> __builtin_clzll).  So the (32 - clz(u64)) bit might confuse folks, as
> It's pretty subtle that __builtin_clz() only works on the bottom
> 32bits.  The code should definitely get clear a comment on this.
> 
> And thinking more, would using ffs() be more straightforward here for
> what is wanted?
> 

This is a nice head up, thanks so much ! I'll switch to ffs() and see
how it goes.

> > This change is tested against a test script [1].
> > Result shown that it can save a significant amount of execution overhead
> > for this function, which is invoked often, the whole system would likely
> > benefit from it.
> >
> > -----------------------------
> > | old version | new version |
> > -----------------------------
> > |  11500.6 ns |       44 ns |
> > -----------------------------
> >
> > The execution time is reduced around 99.7%
> 
> That is impressive, but should probably be contextualized around how
> often one might expect clocks_calc_mult_shift() to be called.
> It still looks like a good optimization, but we shouldn't over-sell it
> unless there's a concrete impact to users.
> 

No problem, I'll try to use bpf or some debug prints approach to test
the invoked frequency of this function, what do you think ? maybe
there're better and more elegant approach to do such profiling ?

Btw I ran each test for 10 times and take the average of them, I should
amend this in the commit message.

> Thanks again for sending this along! I'd just take another pass at the
> commit message and the comments and see if this can be made easier to
> understand when someone 5 years from now is looking at the code.
> 
> thanks
> -john

Hi John,

Thanks for your review and these suggestions !
I'll refine the commit message and do more test ( e.g. use ffs() ) to
see if it can be better, I agree the commit message might be too
confusing, I'll work on that more, thanks for your time!

Best regards,
I Hsin Cheng