[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAPLW+4=kaSZaMuzAuCRz21qcf2M+1LBrzJULNJaZv6+FQOq4WA@mail.gmail.com>
Date: Wed, 20 Dec 2023 09:12:14 -0600
From: Sam Protsenko <semen.protsenko@...aro.org>
To: Tudor Ambarus <tudor.ambarus@...aro.org>
Cc: peter.griffin@...aro.org, robh+dt@...nel.org,
krzysztof.kozlowski+dt@...aro.org, mturquette@...libre.com, sboyd@...nel.org,
conor+dt@...nel.org, andi.shyti@...nel.org, alim.akhtar@...sung.com,
gregkh@...uxfoundation.org, jirislaby@...nel.org, catalin.marinas@....com,
will@...nel.org, s.nawrocki@...sung.com, tomasz.figa@...il.com,
cw00.choi@...sung.com, arnd@...db.de, andre.draszik@...aro.org,
saravanak@...gle.com, willmcvicker@...gle.com,
linux-arm-kernel@...ts.infradead.org, linux-samsung-soc@...r.kernel.org,
linux-clk@...r.kernel.org, devicetree@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-i2c@...r.kernel.org,
linux-serial@...r.kernel.org
Subject: Re: [PATCH 07/13] clk: samsung: gs101: mark PERIC0 IP TOP gate clock
as critical
On Wed, Dec 20, 2023 at 8:22 AM Tudor Ambarus <tudor.ambarus@...aro.org> wrote:
>
> Hi, Sam!
>
> On 12/19/23 17:31, Sam Protsenko wrote:
> > On Tue, Dec 19, 2023 at 10:47 AM Tudor Ambarus <tudor.ambarus@...aro.org> wrote:
> >>
> >> Hi, Sam!
> >>
> >> On 12/14/23 16:43, Sam Protsenko wrote:
> >>> On Thu, Dec 14, 2023 at 10:15 AM Tudor Ambarus <tudor.ambarus@...aro.org> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On 12/14/23 16:09, Sam Protsenko wrote:
> >>>>> On Thu, Dec 14, 2023 at 10:01 AM Tudor Ambarus <tudor.ambarus@...aro.org> wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 12/14/23 15:37, Sam Protsenko wrote:
> >>>>>>> On Thu, Dec 14, 2023 at 4:52 AM Tudor Ambarus <tudor.ambarus@...aro.org> wrote:
> >>>>>>>>
> >>>>>>>> Testing USI8 I2C with an eeprom revealed that when the USI8 leaf clock
> >>>>>>>> is disabled it leads to the CMU_TOP PERIC0 IP gate clock disablement,
> >>>>>>>> which then makes the system hang. To prevent this, mark
> >>>>>>>> CLK_GOUT_CMU_PERIC0_IP as critical. Other clocks will be marked
> >>>>>>>> accordingly when tested.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Tudor Ambarus <tudor.ambarus@...aro.org>
> >>>>>>>> ---
> >>>>>>>> drivers/clk/samsung/clk-gs101.c | 2 +-
> >>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>>>>
> >>>>>>>> diff --git a/drivers/clk/samsung/clk-gs101.c b/drivers/clk/samsung/clk-gs101.c
> >>>>>>>> index 3d194520b05e..08d80fca9cd6 100644
> >>>>>>>> --- a/drivers/clk/samsung/clk-gs101.c
> >>>>>>>> +++ b/drivers/clk/samsung/clk-gs101.c
> >>>>>>>> @@ -1402,7 +1402,7 @@ static const struct samsung_gate_clock cmu_top_gate_clks[] __initconst = {
> >>>>>>>> "mout_cmu_peric0_bus", CLK_CON_GAT_GATE_CLKCMU_PERIC0_BUS,
> >>>>>>>> 21, 0, 0),
> >>>>>>>> GATE(CLK_GOUT_CMU_PERIC0_IP, "gout_cmu_peric0_ip", "mout_cmu_peric0_ip",
> >>>>>>>> - CLK_CON_GAT_GATE_CLKCMU_PERIC0_IP, 21, 0, 0),
> >>>>>>>> + CLK_CON_GAT_GATE_CLKCMU_PERIC0_IP, 21, CLK_IS_CRITICAL, 0),
> >>>>>>>
> >>>>>>> This clock doesn't seem like a leaf clock. It's also not a bus clock.
> >>>>>>> Leaving it always running makes the whole PERIC0 CMU clocked, which
> >>>>>>> usually should be avoided. Is it possible that the system freezes
> >>>>>>> because some other clock (which depends on peric0_ip) gets disabled as
> >>>>>>> a consequence of disabling peric0_ip? Maybe it's some leaf clock which
> >>>>>>> is not implemented yet in the clock driver? Just looks weird to me
> >>>>>>> that the system hangs because of CMU IP clock disablement. It's
> >>>>>>> usually something much more specific.
> >>>>>>
> >>>>>> The system hang happened when I tested USI8 in I2C configuration with an
> >>>>>> eeprom. After the eeprom is read the leaf gate clock that gets disabled
> >>>>>> is the one on PERIC0 (CLK_GOUT_PERIC0_CLK_PERIC0_USI8_USI_CLK). I assume
> >>>>>> this leads to the CMU_TOP gate (CLK_CON_GAT_GATE_CLKCMU_PERIC0_IP)
> >>>>>> disablement which makes the system hang. Either marking the CMU_TOP gate
> >>>>>> clock as critical (as I did in this patch) or marking the leaf PERIC0
> >>>>>> gate clock as critical, gets rid of the system hang. Did I choose wrong?
> >>>>>>
> >>>>>
> >>>>> Did you already implement 100% of clocks in CMU_PERIC0? If no, there
> >>>>
> >>>> yes.
> >>
> >> I checked again all the clocks. I implemented all but one, the one
> >> defined by the CLK_CON_BUF_CLKBUF_PERIC0_IP register. Unfortunately I
> >> don't have any reference on how it should be defined so I won't touch it
> >> yet. But I have some good news too, see below.
> >>
> >>>
> >>> Ok. Are there any other CMUs (perhaps not implemented yet) which
> >>> consume clocks from CMU_PERIC0, specifically PERIC0_IP clock or some
> >>> clocks derived from it? If so, is there a chance some particular leaf
> >>> clock in those CMUs actually renders the system frozen when disabled
> >>> as a consequence of disabling PERIC0_IP, and would explain better why
> >>> the freeze happens?
> >>>
> >>> For now I think it's ok to have that CLK_IS_CRITICAL flag here,
> >>> because as you said you implemented all clocks in this CMU and neither
> >>> of those looks like a critical one. But I'd advice to add a TODO
> >>> comment saying it's probably a temporary solution before actual leaf
> >>> clock which leads to freeze is identified (which probably resides in
> >>> some other not implemented yet CMU).
> >>>
> >>>>
> >>>>> is a chance some other leaf clock (which is not implemented yet in
> >>>>> your driver) gets disabled as a result of PERIC0_IP disablement, which
> >>>>> might actually lead to that hang you observe. Usually it's some
> >>>>> meaningful leaf clock, e.g. GIC or interconnect clocks. Please check
> >>>>> clk-exynos850.c driver for CLK_IS_CRITICAL and CLK_IGNORE_UNUSED flags
> >>>>> and the corresponding comments I left there, maybe it'll give you more
> >>>>> particular idea about what to look for. Yes, making the whole CMU
> >>>>> always running without understanding why (i.e. because of which
> >>>>> particular leaf clock) might not be the best way of handling this
> >>>>
> >>>> because of CLK_GOUT_PERIC0_CLK_PERIC0_USI8_USI_CLK
> >>>
> >>> That's not a root cause here. And I think PERIC0_IP is neither.
> >>>
> >>
> >> you were right!
> >>>>
> >>>>> issue. I might be mistaken, but at least please check if you
> >>>>> implemented all clocks for PERIC0 first and if making some meaningful
> >>>>> leaf clock critical makes more sense.
> >>>>>
> >>
> >> I determined which leaf clocks shall be marked as critical. I enabled
> >> the debugfs clock write access. Then I made sure that the parents of the
> >> PERIC0 CMU have at least one user so that they don't get disabled after
> >> an enable-disable sequence on a leaf clock. The I took all the PERIC0
> >> gate clocks and enabled and disabled them one by one. Whichever hang the
> >> system when the clock was disabled was marked as critical. The list of
> >> critical leaf clocks is as following:
> >>
> >
> > Nice! I used somehow similar procedure for clk-exynos850, doing
> > basically the same thing, but in core clock driver code.
> >
> >> "gout_peric0_peric0_cmu_peric0_pclk",
> >> "gout_peric0_lhm_axi_p_peric0_i_clk",
> >> "gout_peric0_peric0_top1_ipclk_0",
> >> "gout_peric0_peric0_top1_pclk_0".
> >>
> >> I'll update v2 with this instead. Thanks for the help, Sam!
> >
> > Glad you weren't discouraged by my meticulousness :) In clk-exynos850
> > I usually used CLK_IGNORE_UNUSED for clocks like XXX_CMU_XXX (in your
> > case it's PERIC0_CMU_PERIC0), with a corresponding comment. Those
> > clocks usually can be used to disable the bus clock for corresponding
> > CMU IP-core (in your case CMU_PERIC0), which makes it impossible to
> > access the registers from that CMU block, as its register interface is
> > not clocked anymore. Guess I saw something similar in Exynos5433 or
> > Exynos7 clk drivers, or maybe Sylwester or Krzysztof told me to do so
> > -- don't really remember. For AXI clock it also seems logical to keep
> > it running (AXI bus might be used for GIC and memory). But again,
> > maybe CLK_IGNORE_UNUSED flag would be more appropriate that
> > CLK_IS_CRITICAL? For the last two clocks -- it's hard to tell what
> > exactly they do. Is TOP1 some other CMU or block name, and is there
> > any further users for those clocks?
> >
> > Anyways, if you are working on v2, please consider doing next two
> > things while at it:
> >
> > 1. For each critical clock: add corresponding comment explaining why
> > it's marked so
>
> Will do.
>
> > 2. Consider using CLK_IGNORE_UNUSED instead of CLK_IS_CRITICAL when
> > appropriate; both have their use in different cases
> >
> > Btw, if you check other Exynos clk drivers, there is a lot of examples
> > for flags like those.
> >
> Thanks for the feedback, it's educative.
>
> I played a little with the clk debugfs and I think all should be marked
> as critical. What I did was to make sure that their parents are enabled
> already and then I enabled and disabled each. Each time I disabled one
> of them the system hung. Thus in case they will be used, if one disable
> them on an error path, it will hang the system. We can't disable them at
> suspend either. Thus I propose to keep them as critical.
>
Do you see those clocks potentially used by some actual consumers in
future? If no, maybe CLK_IGNORE_UNUSED is enough (just to make sure
the core clock framework won't disable those during the clocks
initialization)? Anyway, I don't have any strong preferences in this
case. If you think CLK_IS_CRITICAL is better in this case, I'd say go
for it.
Also, on a bit different note: please make sure there is no
"clk_ignore_unused" param in your kernel cmdline (e.g. passed from the
bootloader via dts). The clock driver should be functional without
that param. Though it might take some additional work.
> Thanks!
> ta
Powered by blists - more mailing lists