lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9922942e-ea9e-4cdb-a091-5b8ea0a180d8@salutedevices.com>
Date: Sat, 24 Feb 2024 00:47:35 +0300
From: Jan Dakinevich <jan.dakinevich@...utedevices.com>
To: Stephen Boyd <sboyd@...nel.org>, Michael Turquette
	<mturquette@...libre.com>, <linux-clk@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>
CC: <kernel@...utedevices.com>
Subject: Re: [PATCH] clk: allow to skip clk_core_req_round_rate_nolock()



On 2/23/24 02:20, Stephen Boyd wrote:
> Quoting Jan Dakinevich (2024-01-26 12:14:33)
>> Calling of clk_core_req_round_rate_nolock() can be time-consuming in a
>> case of deep hierarchy with multiple dividers/parents. But if the clock
>> already has exactly the same rate as desired, there is no need to
>> determine how it could be rounded.
> 
> What exactly are you trying to avoid? Is this an optimization or a bug
> fix? TL;DR: I'm unlikely to apply this patch.
> 

It is an optimization, not a bug. The problem is that 
clk_core_req_round_rate_nolock() is quite expensive, and I faced with 
cases, where it takes tens and hundreds milliseconds (depending on SoC).

As I see, it is irremovable feature of clk_core_req_round_rate_nolock() 
design itself. Lets imagine, we have some clock, and its parent is a 
divider. When clk_core_req_round_rate_nolock() is being called the 
execution is walked through the following path:

clk_core_determine_round_nolock
  core->ops->determine_rate
    divider_determine_rate
     clk_divider_bestdiv

Inside clk_divider_bestdiv() for each possible divider 
clk_hw_round_rate() is called for parent of the clock, which in turn 
calls clk_core_determine_round_nolock().

So, each divider and multiplexer in clock path multiplies many times an 
amount of iteration required to execute 
clk_core_req_round_rate_nolock(). When there are a lot of them the time 
consumed by clk_core_req_round_rate_nolock() becomes sufficient.

> I could see some driver implementing round_rate()/determine_rate() in a
> way that rounds the rate passed in, so that even if the rate is what the
> clk is running at _right now_, it still wants to change it to something
> else, or at least call down into the driver to call the set_rate clk_op.
> Applying this patch will break that. The contract is that
> clk_set_rate(rate) == clk_set_rate(clk_round_rate(rate)). It doesn't
> look like anything needs to change.

If I am not mistaken, clocks's rate is either equal to its parent rate 
or calculated by ->recalc_rate(). I suppose, this callback should return 
valid rate value that is based on current clock parameters.

Now, suppose the clock has rate "rateA" and we called clk_set_rate() to 
set "rateA", but clk_core_req_round_rate_nolock() inside clk_set_rate() 
rounds it to "rateB". Thus, although the clock is able to run on desired 
rate (and actually run on it), ->determine_rate() and ->round_rate() are 
unable to choose clocks's parameters for that value. Is it correct 
behavior for clock driver?



-- 
Best regards
Jan Dakinevich

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ