linux-kernel - Re: [PATCH v1 2/3] clk: fractional-divider: Introduce NO

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3cd7393a34ca991184722e57b6c64737973b31c4.camel@nxp.com>
Date:   Thu, 22 Jul 2021 17:08:52 +0800
From:   Liu Ying <victor.liu@....com>
To:     Andy Shevchenko <andy.shevchenko@...il.com>
Cc:     Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Heiko Stuebner <heiko@...ech.de>,
        Elaine Zhang <zhangqing@...k-chips.com>,
        Stephen Boyd <sboyd@...nel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-clk <linux-clk@...r.kernel.org>,
        linux-arm Mailing List <linux-arm-kernel@...ts.infradead.org>,
        "open list:ARM/Rockchip SoC..." <linux-rockchip@...ts.infradead.org>,
        Michael Turquette <mturquette@...libre.com>,
        NXP Linux Team <linux-imx@....com>,
        Jacky Bai <ping.bai@....com>
Subject: Re: [PATCH v1 2/3] clk: fractional-divider: Introduce NO_PRESCALER
 flag

On Thu, 2021-07-22 at 10:24 +0300, Andy Shevchenko wrote:
> On Thu, Jul 22, 2021 at 9:04 AM Liu Ying <victor.liu@....com> wrote:
> > On Mon, 2021-07-19 at 15:09 +0300, Andy Shevchenko wrote:
> > > On Mon, Jul 19, 2021 at 11:16:07AM +0800, Liu Ying wrote:
> > > > On Fri, 2021-07-16 at 16:19 +0300, Andy Shevchenko wrote:
> > > > > On Fri, Jul 16, 2021 at 10:43:57AM +0800, Liu Ying wrote:
> > > > > > On Thu, 2021-07-15 at 15:07 +0300, Andy Shevchenko wrote:
> > > > > > Second and more important, it seems that it would be good to decouple
> > > > > > the prescaler knowledge from this fractional divider clk driver so as
> > > > > > to make it simple(Output rate = (m / n) * parent_rate).  This way, the
> > > > > > CLK_FRAC_DIVIDER_NO_PRESCALER flag is not even needed at the first
> > > > > > place, which means rational_best_approximation() just _directly_
> > > > > > offer best_{numerator,denominator} for all cases.
> > > > > 
> > > > > Feel free to submit a patch, just give a good test to avoid breakage of almost
> > > > > all users of this driver.
> > > > 
> > > > Maybe someone may do that.
> > > 
> > > Perhaps. The idea per se is good I think, but I doubt that the implementation
> > > will be plausible.
> > > 
> > > > I just shared my thought that it sounds
> > > > like a good idea
> > > 
> > > Thanks!
> > > 
> > > > to decouple the prescaler knowledge from this
> > > > fractional divider clk driver.
> > > 
> > > Are you suggesting that each of the device that has _private_ pre-scaler has to
> > > be a clock provider at the same time?
> > 
> > Maybe it depends on specific devices.  But, if a device is designed to
> > dedicatedly control clocks, being a clock provider seems to be
> > intuitive.
> 
> OK.
> 
> > > OTOH you will probably need irrespresentable hierarchy to avoid saturated values.
> > > 
> > > At least those two issues I believe makes the idea fade in complications of the
> > > actual implementation. But again, send the code (you or anybody else) and we will
> > > see how it looks like.
> > 
> > Aside from making this fractional divider clk driver simple, there
> > seems to be another reason for decoupling the prescaler knowledge from
> > the driver.  That is, the 'left shifting' done in
> > clk_fd_general_approximation()/clk_fd_round_rate() is likely to cause
> > mis-match bewteen 'rate = clk_round_rate(clk, r);' and
> > 'clk_set_rate(clk, r); rate = clk_get_rate(clk);' as kerneldoc
> > of clk_round_rate() mentions that they are kinda equivalent
> > in include/linux/clk.h. clk_fd_set_rate() doesn't really contain the
> > 'left shifting'.
> > 
> > So, it looks like decoupling is the right way to go.
> 
> OK.
> 
> ...
> 
> > > > > > Further more, is it
> > > > > > possilbe for rational_best_approximation() to make sure there is no
> > > > > > risk of overflow for best_{numerator,denominator}, since
> > > > > > max_{numerator,denominator} are already handed over to
> > > > > > rational_best_approximation()?
> > > > > 
> > > > > How? It can not be satisfied for all possible inputs.
> > > > 
> > > > Just have rational_best_approximation() make sure
> > > > best_{numerator,denominator} are in the range of
> > > > [1, max_{numerator,denominator}] for all given_{numerator,denominator}.
> > > > At the same time, best_numerator/best_denominator should be as close
> > > > to given_numerator/given_denominator as possible. For this particular
> > > > fractional divider clk use case, clk_round_rate() can be called
> > > > multiple times until users find rounded rate is ok.
> > > 
> > > How is it supposed to work IRL? E.g. this driver is being used for UART. Serial
> > 
> > I guess the drivers are drivers/acpi/acpi_lpss.c and drivers/mfd/intel-
> > lpss.c? Both for Intel.
> 
> At least those I have knowledge of. Others, if any, seem to have taken
> this into account.
> 
> > > core (or even TTY) has a specific function to approximate the baud rate and it
> > > tries it 2 or 3 times. In case of *saturated* values it won't progress anyhow
> > > because from best rational approximation algorithm the very first attempt would
> > > be done against the best possible clock rate.
> > > 
> > > Can you provide some code skeleton to see?
> > 
> > Perhaps, two approaches can be taken in driver which uses the
> > fractional divider clock:
> > 1) Tune prescaler to generate higher rate or lower rate accordingly
> > when clk_round_rate() for the fractional divider clock returns lower or
> > higher rates then desired rate. This might take several rounds until
> > desired rate is satisfied w/wo a tolerated bias.
> > 2) Put working clock rates and/or parent clock rates in a table as sort
> > of prior knowledge, which means less code for rate negotiation.
> 
> Often 2) is a bad idea which I'm against from day 1. I prefer to
> calculate what can be calculated.
> The 1) looks better but requires several (unnecessary IIRC) rounds.
> Why not supply the additional parameter(s) to tell that we have a
> prescaller with certain limitations?

To me, it's kinda too much information to this common frational divider
clk driver.  Making the common driver simple and easy to maintain is
important.

> 
> ...
> 
> > > > > > Overflowed/unreasonable
> > > > > > best_{numerator,denominator} don't sound like the "best" offered value.
> > > > > 
> > > > > I don't follow here. If you got saturated values it means that your input is
> > > > > not convergent. In practice it means that we will supply quite a bad value to
> > > > > the caller.
> > > > 
> > > > Just like I mentioned above, if given_{numerator,denominator} are not
> > > > convergent, best_numerator/best_denominator should be as close
> > > > to given_numerator/given_denominator as possible and at the same time
> > > > best_{numerator,denominator} are in the range of
> > > > [1, max_{numerator,denominator}].  This way, caller may have chance to
> > > > propose convergent inputs.
> > > 
> > > How? Again, provide some code to understand this better.
> > > (Spoiler: arithmetics won't allow you to do this. Or maybe
> > >  I'm badly missing something very simple and obvious...)
> > > 
> > > And, if it's possible to achieve, are you suggesting that part of
> > > what CCF driver should do the users will have been doing by their
> > > own?
> > 
> > Well, I just think it doesn't seem to be necessary for the CCF/common
> > frational drivider clk driver to have the prescaler knowledge. The
> > prescaler knowledge can be in a dedicated clk provider(if appropriate)
> > or somewhere else.
> 
> I might disagree on the grounds of the HW hierarchy and the best that
> we may achieve in _one_ pass. For example, for a 16-bit additional
> prescaler it will require up to 16 steps to get the best possible

Would that be an unacceptable performance penalty?

> values for the m/n. Instead we may supply to this driver the
> information about subordinate prescaler and get the best m/n. The
> caller will need to just divide the resulting rate by the asked rate
> to get a prescaler value.

IMHO, a simpler fractional divider clk driver without the prescaler
knowledge wins the tradeoff.

> 
> ...
> 
> > > TL;DR: please send a code to discuss.
> > 
> > It seems that you have some experience on those intel drivers, this
> > clock driver and rational algorithm driver and you probably have intel
> > HWs to test.  May I encourage you to look into this and decouple the
> > prescaler knowledge out :-)
> > 
> > > Thanks for review and you review of v2 is warmly welcomed!
> > 
> > I'd like to see patches to decouple the prescaler knowledge out.
> 
> Then produce them! Currently the code works for all its users and does
> not need any changes (documentation is indeed a gap).

IIUC, only the two Intel drivers mentioned before are affected.
Rockchip has it's own ->approximation() callback and i.MX7ulp hasn't
the prescaler(IIUC), thus kinda not affected.  So, perhaps you may help
look into this and decouple the prescaler knowledge out, as it seems
that you have experience on the relevant drivers and HW to test.
Anyway, to me, it is _not_ a must to have if you really think it's hard
to do or unnesessary :-)

> 
> > V2, like v1, tries to consolidate the knowledge in this fractional
> > divider clk driver. So, not the right direction I think.
> 
> Then why are you commenting here and not there? :-)

Maybe v2 was sent too quickly as the decoupling comment on v1 hasn't
been sufficiently discussed :-) I'll comment v2 briefly.

> I think I would drop patch 2 from the set (patch 1 is Acked and patch
> 3 is definitely needed to describe current state of affairs) on the
> grounds of the comments.

Please consider i.MX7ulp, as it hasn't the prescaler IIUC. i.MX7ulp
needs NO_PRESCALER flag, if we keep the prescaler knowledge in this
driver ofc.

Regards,
Liu Ying