linux-kernel - Re: [PATCH] tty: serial: qcom-geni-serial: minor fixes to get_clk_div

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <349a6a9b-28a9-e859-0db2-d216553d8ad5@quicinc.com>
Date:   Thu, 9 Jun 2022 23:38:12 +0530
From:   Vijaya Krishna Nivarthi <quic_vnivarth@...cinc.com>
To:     Doug Anderson <dianders@...omium.org>
CC:     Andy Gross <agross@...nel.org>,
        Bjorn Andersson <bjorn.andersson@...aro.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Jiri Slaby <jirislaby@...nel.org>,
        linux-arm-msm <linux-arm-msm@...r.kernel.org>,
        <linux-serial@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>, <quic_msavaliy@...cinc.com>,
        Matthias Kaehlcke <mka@...omium.org>,
        "Stephen Boyd" <swboyd@...omium.org>
Subject: Re: [PATCH] tty: serial: qcom-geni-serial: minor fixes to
 get_clk_div_rate()

Hi,

Re-sending as my earlier message bounced.


On 6/9/2022 4:07 AM, Doug Anderson wrote:
> Hi,
>
> On Wed, Jun 8, 2022 at 11:34 AM Vijaya Krishna Nivarthi
> <quic_vnivarth@...cinc.com> wrote:
>> Hi,
>>
>>
>> On 6/8/2022 12:55 AM, Doug Anderson wrote:
>>> Hi,
>>>
>>> On Tue, Jun 7, 2022 at 10:40 AM Vijaya Krishna Nivarthi
>>> <quic_vnivarth@...cinc.com> wrote:
>>>> Hi,
>>>>
>>>> On 6/7/2022 1:29 AM, Doug Anderson wrote:
>>>>
>>>> My only concern continues to be...
>>>>
>>>> Given ser_clk is the final frequency that this function is going to
>>>> return and best_div is going to be the clk_divider, is it ok if the
>>>> divider cant divide the frequency exactly?
>>>>
>>>> In other words, Can this function output combinations like (402,4)
>>>> (501,5) ?
>>>>
>>>> If ok, then we can go ahead with this patch or even previous perhaps.
>>> I don't see why not. You're basically just getting a resulting clock
>>> that's not an integral "Hz", right?
>>>
>>> So if "baud" is 9600 and sampling_rate is 16 then desired_clk is (9600
>>> * 16) = 153600
>>>
>>> Let's imagine that we do all the math and we finally decide that our
>>> best bet is with the rate 922000 and a divider of 6. That means that
>>> the actual clock we'll make is 153666.67 when we _wanted_ 153600.
>>> There's no reason it needs to be integral, though, and 153666.67 would
>>> still be better than making 160000.
>>>
>> Thank you for clarification.
>>>>> power?)
>>>> Actually power saving was the anticipation behind returning first
>>>> frequency in original patch, when we cant find exact frequency.
>>> Right, except that if you just pick the first clock you find it would
>>> be _wildly_ off. I guess if you really want to do this the right way,
>>> you need to set a maximum tolerance and pick the first rate you find
>>> that meets that tolerance. Random web search for "uart baud rate
>>> tolerance" makes me believe that +/- 5% deviation is OK, but to be
>>> safe you probably want something lower. Maybe 2%? So if the desired
>>> clock is within 2% of a clock you can make, can you just pick that
>>> one?
>> Ok, 2% seems good.
>>>>>> Please note that we go past cases when we have an divider that can
>>>>>> exactly divide the frequency(105/1, 204/2, 303/3) and end up with one
>>>>>> that doesn't.
>>>>> Ah, good point. Luckily that's a 1-line fix, right?
>>>> Apologies, I could not figure out how.
>>> Ah, sorry. Not quite 1 line, but this (untested)
>>>
>>>
>>> freq = clk_round_rate(clk, mult);
>>>
>>> if (freq % desired_clk == 0) {
>>>    ser_clk = freq;
>>>    best_div = freq / desired_clk;
>>>    break;
>>> }
>>>
>>> candidate_div = max(1, DIV_ROUND_CLOSEST(freq, desired_clk));
>>> candidate_freq = freq / candidate_div;
>>> diff = abs((long)desired_clk - candidate_freq);
>>> if (diff < best_diff) {
>>>     best_diff = diff;
>>>     ser_clk = freq;
>>>     best_div = candidate_div;
>>> }
>> But then once again, we would likely need 2 loops because while we are
>> ok with giving up on search for best_div on finding something within 2%
>> tolerance, we may not want to give up on exact match (freq % desired_clk
>> == 0 )
> Ah, it took me a while to understand why two loops. It's because in
> one case you're trying multiplies and in the other you're bumping up
> to the next closest clock rate. I don't think you really need to do
> that. Just test the (rate - 2%) and the rate. How about this (only
> lightly tested):
>
>      ser_clk = 0;
>      maxdiv = CLK_DIV_MSK >> CLK_DIV_SHFT;
>      div = 1;
>      while (div < maxdiv) {
div <= maxdiv ?
>          mult = (unsigned long long)div * desired_clk;
>          if (mult != (unsigned long)mult)
>              break;
>
>          two_percent = mult / 50;
>
>          /*
>           * Loop requesting (freq - 2%) and possibly (freq).
>           *
>           * We'll keep track of the lowest freq inexact match we found
>           * but always try to find a perfect match. NOTE: this algorithm
>           * could miss a slightly better freq if there's more than one
>           * freq between (freq - 2%) and (freq) but (freq) can't be made
>           * exactly, but that's OK.
>           *
>           * This absolutely relies on the fact that the Qualcomm clock
>           * driver always rounds up.
>           */
>          test_freq = mult - two_percent;
>          while (test_freq <= mult) {
>              freq = clk_round_rate(clk, test_freq);
>
>              /*
>               * A dead-on freq is an insta-win. This implicitly
>               * handles when "freq == mult"
>               */
>              if (!(freq % desired_clk)) {
>                  *clk_div = freq / desired_clk;
>                  return freq;
>              }
>
>              /*
>               * Only time clock framework doesn't round up is if
>               * we're past the max clock rate. We're done searching
>               * if that's the case.
>               */
>              if (freq < test_freq)
>                  return ser_clk;
>
>              /* Save the first (lowest freq) within 2% */
>              if (!ser_clk && freq <= mult + two_percent) {
>                  ser_clk = freq;
>                  *clk_div = div;
>              }
My last concern is with search happening only within 2% tolerance.

Do we fail otherwise?

This real case has best tolerance of 1.9%.

[   17.963672] 20220530 desired_clk-51200000
[   21.193550] 20220530 returning ser_clk-52174000, div-1, diff-974000

Seems close.

Thank you.
>
>              /*
>               * If we already rounded up past mult then this will
>               * cause the loop to exit. If not then this will run
>               * the loop a second time with exactly mult.
>               */
>              test_freq = max(freq + 1, mult);
>          }
>
>          /*
>           * test_freq will always be bigger than mult by at least 1.
>           * That means we can get the next divider with a DIV_ROUND_UP.
>           * This has the advantage of skipping by a whole bunch of divs
>           * If the clock framework already bypassed them.
>           */
>          div = DIV_ROUND_UP(test_freq, desired_clk);
>          }
>
>      return ser_clk;