linux-kernel - Re: [PATCH v2 2/2] clk: divider: Fix divisions

Message-ID: <5e9e776db6628d02d7081292b81ab102.sboyd@kernel.org>
Date:   Mon, 12 Jun 2023 17:41:50 -0700
From:   Stephen Boyd <sboyd@...nel.org>
To:     Michael Turquette <mturquette@...libre.com>,
        Sebastian Reichel <sebastian.reichel@...labora.com>,
        linux-clk@...r.kernel.org, linux-kernel@...r.kernel.org
Cc:     Christopher Obbard <chris.obbard@...labora.com>,
        David Laight <David.Laight@...LAB.COM>,
        Sebastian Reichel <sebastian.reichel@...labora.com>,
        kernel@...labora.com
Subject: Re: [PATCH v2 2/2] clk: divider: Fix divisions

Quoting Sebastian Reichel (2023-05-26 10:10:57)
> The clock framework handles clock rates as "unsigned long", so u32 on
> 32-bit architectures and u64 on 64-bit architectures.
> 
> The current code pointlessly casts the dividend to u64 on 32-bit
> architectures, thus pointlessly reducing performance.

It looks like that was done to make the DIV_ROUND_UP() macro not
overflow the dividend on 32-bit machines (from 9556f9dad8f5):

  DIV_ROUND_UP(3000000000, 1500000000) = (3.0G + 1.5G - 1) / 1.5G
                                       = OVERFLOW / 1.5G

but I agree, the u64 cast is not necessary if DIV_ROUND_UP_ULL() is
used as that macro casts the dividend to unsigned long long anyway.
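
To make the overflow above concrete, here is a minimal userspace sketch.
It is not kernel code: uint32_t stands in for "unsigned long" on a
32-bit machine, and both helper names are made up for illustration.

```c
#include <stdint.h>

/*
 * Same shape as DIV_ROUND_UP(), but forced to 32-bit arithmetic to
 * model "unsigned long" on a 32-bit machine (hypothetical helper).
 */
static inline uint32_t div_round_up_u32(uint32_t n, uint32_t d)
{
	/* n + d - 1 wraps modulo 2^32 before the division happens */
	return (uint32_t)(n + d - 1) / d;
}

/*
 * Widen the dividend to 64 bits before adding, which is the effect
 * the unsigned long long cast has (hypothetical helper).
 */
static inline uint64_t div_round_up_wide(uint32_t n, uint32_t d)
{
	return ((uint64_t)n + d - 1) / d;
}
```

With n = 3000000000 and d = 1500000000 the 32-bit variant wraps to
205032703 before dividing and returns 0, while the widened variant
returns the expected 2.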

> 
> On the other hand, on 64-bit architectures the divisor is masked and only
> the lower 32 bits are used. Thus requesting a frequency >= 4.3GHz results
> in incorrect values. For example requesting 4300000000 (4.3 GHz) will
> effectively request ca. 5 MHz.

Nice catch. But I'm concerned that changing to DIV_ROUND_UP() breaks the
32-bit overflow case above. As this code is generic, I fear we'll have to
change the rate divisions to use DIV64_U64_ROUND_UP(), because we don't
know how large the rate is (i.e. it could be larger than 32 bits on a
64-bit machine).
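
The divisor truncation can be sketched in userspace as well. This only
models the behaviour described in the commit message, assuming the
divisor is cut down to 32 bits the way a 64-by-32 division helper would
do it. Both function names are hypothetical, and the second one is a
stand-in for a full 64-by-64 round-up divide, not the kernel's
implementation of DIV64_U64_ROUND_UP().

```c
#include <stdint.h>

/*
 * Round-up divide where only the low 32 bits of the divisor reach the
 * division (hypothetical model of the bug). Callers must not pass a
 * divisor whose low 32 bits are all zero.
 */
static inline uint64_t div_round_up_trunc32(uint64_t n, uint64_t d)
{
	uint32_t base = (uint32_t)d;	/* upper 32 bits are dropped */

	return (n + d - 1) / base;
}

/*
 * Full 64-by-64 round-up divide, written so that the intermediate
 * value cannot overflow (hypothetical helper).
 */
static inline uint64_t div_round_up_full(uint64_t n, uint64_t d)
{
	return n / d + (n % d != 0);
}
```

For a requested rate of 4300000000 the truncated divisor is 5032704,
i.e. the "ca. 5 MHz" from the commit message. With a parent rate of
8600000000 the truncated variant yields a divider of 2563, the full
variant yields 2.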

> Requesting clk_round_rate(clk, ULONG_MAX)
> is a bit of a special case, since that still returns correct values as
> long as the parent clock is below 8.5 GHz.
> 
> Signed-off-by: Sebastian Reichel <sebastian.reichel@...labora.com>
> ---
>  drivers/clk/clk-divider.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/clk/clk-divider.c b/drivers/clk/clk-divider.c
> index a2c2b5203b0a..c38e8aa60e54 100644
> --- a/drivers/clk/clk-divider.c
> +++ b/drivers/clk/clk-divider.c
> @@ -220,7 +220,7 @@ static int _div_round_up(const struct clk_div_table *table,
>                          unsigned long parent_rate, unsigned long rate,
>                          unsigned long flags)
>  {
> -       int div = DIV_ROUND_UP_ULL((u64)parent_rate, rate);
> +       int div = DIV_ROUND_UP(parent_rate, rate);
>  
>         if (flags & CLK_DIVIDER_POWER_OF_TWO)
>                 div = __roundup_pow_of_two(div);
> @@ -237,7 +237,7 @@ static int _div_round_closest(const struct clk_div_table *table,
>         int up, down;
>         unsigned long up_rate, down_rate;
>  
> -       up = DIV_ROUND_UP_ULL((u64)parent_rate, rate);
> +       up = DIV_ROUND_UP(parent_rate, rate);
>         down = parent_rate / rate;
>  
>         if (flags & CLK_DIVIDER_POWER_OF_TWO) {
> @@ -473,7 +473,7 @@ int divider_get_val(unsigned long rate, unsigned long parent_rate,
>  {
>         unsigned int div, value;
>  
> -       div = DIV_ROUND_UP_ULL((u64)parent_rate, rate);
> +       div = DIV_ROUND_UP(parent_rate, rate);
>  
>         if (!_is_valid_div(table, div, flags))
>                 return -EINVAL;

This is undoing parts of commit 9556f9dad8f5 ("clk: divider: handle
integer overflow when dividing large clock rates"). Please pair this
patch with extensive kunit tests in a new test suite file,
clk-divider_test.c. I don't know if UML supports changing sizeof(long),
but that would be a cool feature to tease out these sorts of issues. I
suppose we'll just have to run the kunit tests on various architectures
to cover the possibilities.