linux-kernel - Re: [PATCH v1 4/4] clk: thead: Add GPU clock gate control with CLKGEN reset support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAPDyKFqsJaTrF0tBSY-TjpqdVt5=6aPQHYfnDebtphfRZSU=-Q@mail.gmail.com>
Date: Wed, 26 Mar 2025 12:24:22 +0100
From: Ulf Hansson <ulf.hansson@...aro.org>
To: Stephen Boyd <sboyd@...nel.org>
Cc: Michal Wilczynski <m.wilczynski@...sung.com>, Philipp Zabel <p.zabel@...gutronix.de>, alex@...ti.fr, 
	aou@...s.berkeley.edu, conor+dt@...nel.org, drew@...7.com, guoren@...nel.org, 
	jszhang@...nel.org, krzk+dt@...nel.org, m.szyprowski@...sung.com, 
	mturquette@...libre.com, palmer@...belt.com, paul.walmsley@...ive.com, 
	robh@...nel.org, wefu@...hat.com, linux-clk@...r.kernel.org, 
	devicetree@...r.kernel.org, linux-kernel@...r.kernel.org, 
	linux-riscv@...ts.infradead.org, linux-pm@...r.kernel.org
Subject: Re: [PATCH v1 4/4] clk: thead: Add GPU clock gate control with CLKGEN
 reset support

On Tue, 25 Mar 2025 at 23:40, Stephen Boyd <sboyd@...nel.org> wrote:
>
> Quoting Michal Wilczynski (2025-03-19 02:22:11)
> >
> >
> > On 3/13/25 10:25, Philipp Zabel wrote:
> > > On Do, 2025-03-06 at 17:43 +0100, Michal Wilczynski wrote:
> > >>
> > >> On 3/6/25 00:47, Stephen Boyd wrote:
> > >>> Quoting Michal Wilczynski (2025-03-03 06:36:29)
> > >>>> The T-HEAD TH1520 has three GPU clocks: core, cfg, and mem. The mem
> > >>>> clock gate is marked as "Reserved" in hardware, while core and cfg are
> > >>>> configurable. In order for these clock gates to work properly, the
> > >>>> CLKGEN reset must be managed in a specific sequence.
> > >>>>
> > >>>> Move the CLKGEN reset handling to the clock driver since it's
> > >>>> fundamentally a clock-related workaround [1]. This ensures that clk_enabled
> > >>>> GPU clocks stay physically enabled without external interference from
> > >>>> the reset driver.  The reset is now deasserted only when both core and
> > >>>> cfg clocks are enabled, and asserted when either of them is disabled.
> > >>>>
> > >>>> The mem clock is configured to use nop operations since it cannot be
> > >>>> controlled.
> > >>>>
> > >>>> Link: https://lore.kernel.org/all/945fb7e913a9c3dcb40697328b7e9842b75fea5c.camel@pengutronix.de [1]
> > >>>>
> > >>>> Signed-off-by: Michal Wilczynski <m.wilczynski@...sung.com>
> > >>> [...]
> > >>>> diff --git a/drivers/clk/thead/clk-th1520-ap.c b/drivers/clk/thead/clk-th1520-ap.c
> > >>>> index ea96d007aecd..1dfcde867233 100644
> > >>>> --- a/drivers/clk/thead/clk-th1520-ap.c
> > >>>> +++ b/drivers/clk/thead/clk-th1520-ap.c
> > >>>> @@ -862,17 +863,70 @@ static CCU_GATE(CLK_SRAM1, sram1_clk, "sram1", axi_aclk_pd, 0x20c, BIT(3), 0);
> > >>> [...]
> > >>>>
> > >>>>  static CCU_GATE_CLK_OPS(CLK_GPU_MEM, gpu_mem_clk, "gpu-mem-clk",
> > >>>>                         video_pll_clk_pd, 0x0, BIT(2), 0, clk_nop_ops);
> > >>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CORE, gpu_core_clk, "gpu-core-clk",
> > >>>> +                       video_pll_clk_pd, 0x0, BIT(3), 0, ccu_gate_gpu_ops);
> > >>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CFG_ACLK, gpu_cfg_aclk, "gpu-cfg-aclk",
> > >>>> +                       video_pll_clk_pd, 0x0, BIT(4), 0, ccu_gate_gpu_ops);
> > >>>> +
> > >>>> +static void ccu_gpu_clk_disable(struct clk_hw *hw)
> > >>>> +{
> > >>>> +       struct ccu_gate *cg = hw_to_ccu_gate(hw);
> > >>>> +       unsigned long flags;
> > >>>> +
> > >>>> +       spin_lock_irqsave(&gpu_reset_lock, flags);
> > >>>> +
> > >>>> +       ccu_disable_helper(&cg->common, cg->enable);
> > >>>> +
> > >>>> +       if ((cg == &gpu_core_clk &&
> > >>>> +            !clk_hw_is_enabled(&gpu_cfg_aclk.common.hw)) ||
> > >>>> +           (cg == &gpu_cfg_aclk &&
> > >>>> +            !clk_hw_is_enabled(&gpu_core_clk.common.hw)))
> > >>>> +               reset_control_assert(gpu_reset);
> > >>>
> > >>> Why can't the clk consumer control the reset itself? Doing this here is
> > >>> not ideal because we hold the clk lock when we try to grab the reset
> > >>> lock. These are all spinlocks that should be small in lines of code
> > >>> where the lock is held, but we're calling into an entire other framework
> > >>> under a spinlock. If an (unrelated) reset driver tries to grab the clk
> > >>> lock it will deadlock.
> > >>
> > >> So in our case the clk consumer is the drm/imagination driver. Here is
> > >> the comment from the maintainer for my previous attempt to use a reset
> > >> driver to abstract the GPU init sequence [1]:
> > >>
> > >> "Do you know what this resets? From our side, the GPU only has a single
> > >> reset line (which I assume to be GPU_RESET)."
> > >>
> > >> "I don't love that this procedure appears in the platform reset driver.
> > >> I appreciate it may not be clear from the SoC TRM, but this is the
> > >> standard reset procedure for all IMG Rogue GPUs. The currently
> > >> supported TI SoC handles this in silicon, when power up/down requests
> > >> are sent so we never needed to encode it in the driver before.
> > >>
> > >> Strictly speaking, the 32 cycle delay is required between power and
> > >> clocks being enabled and the reset line being deasserted. If nothing
> > >> here touches power or clocks (which I don't think it should), the delay
> > >> could potentially be lifted to the GPU driver."
> > >>
> > >> From the drm/imagination maintainers point of view their hardware has
> > >> only one reset, the extra CLKGEN reset is SoC specific.
> > >
> > > If I am understanding correctly, the CLKGEN reset doesn't reset
> > > anything in the GPU itself, but holds the GPU clock generator block in
> > > reset, effectively disabling the three GPU clocks as a workaround for
> > > the always-ungated GPU_MEM clock.
> > >
> > >> Also the reset driver maintainer didn't like my way of abstracting two
> > >> resets ("GPU" and and SoC specific"CLKGEN") into one reset
> > >
> > > That is one part of it. The other is that (according to my
> > > understanding as laid out above), the combined GPU+CLKGEN reset would
> > > effectively disable all three GPU clocks for a while, after the GPU
> > > driver has already requested them to be enabled.
> >
> > Thank you for your comments Philipp, it seems like we're on the same
> > page here. I was wondering whether there is anything I can do to move the
> > patches forward.
> >
> > Stephen, if the current patch is a no go from your perspective could you
> > please advise whether there is a way to solve this in a clock that would
> > be acceptable to you.
>
> It looks like the SoC glue makes the interactions between the clk and
> reset frameworks complicated because GPU clks don't work if a reset is
> asserted. You're trying to find a place to coordinate the clk and reset.
> Am I right?
>
> I'd advise managing the clks and resets in a generic power domain that
> is attached to the GPU device. In that power domain, coordinate the clk
> and reset sequencing so that the reset is deasserted before the clks are
> enabled (or whatever the actual requirement is). If the GPU driver
> _must_ have a clk and reset pointer to use, implement one that either
> does nothing or flag to the GPU driver that the power domain is managing
> all this for it so it should just use runtime PM and system PM hooks to
> turn on the clks and take the GPU out of reset.
>
> From what I can tell, the GPU driver maintainer doesn't want to think
> about the wrapper that likely got placed around the hardware block
> shipped by IMG. This wrapper is the SoC glue that needs to go into a
> generic power domain so that the different PM resources, reset, clk,
> etc. can be coordinated based on the GPU device's power state. It's
> either that, or go the dwc3 route and have SoC glue platform drivers
> that manage this stuff and create a child device to represent the hard
> macro shipped by the vendor like Synopsys/Imagination. Doing the parent
> device design isn't as flexible as PM domains because you can only have
> one parent device and the child device state can be ignored vs. many PM
> domains attached in a graph to a device that are more directly
> influenced by the device using runtime PM.
>
> Maybe you'll be heartened to know this problem isn't unique and has
> existed for decades :) I don't know what state the graphics driver is in
> but they'll likely be interested in solving this problem in a way that
> doesn't "pollute" their driver with SoC specific details. It's all a
> question of where you put the code. The reset framework wants to focus
> on resets, the clk framework wants to focus on clks, and the graphics
> driver wants to focus on graphics. BTW, we went through a similar
> discussion with regulators and clks years ago and ended up handling that
> with OPPs and power domains.

Right, power-domain providers are mostly implementing SoC specific code.

In some cases, power-domain providers also handle per device SoC
specific constraints/sequences, which seems what you are discussing
here. For that, genpd has a couple of callbacks that could be
interesting to have a look at, such as:

genpd->attach|detach_dev() - for probe/remove
genpd.dev_ops->start|stop() - for runtime/system PM

That said, maybe just using the regular genpd->power_on|off() callback
is sufficient here, depending on how you decide to model things.

>
> I believe a PM domain is the right place for this kind of stuff, and I
> actually presented on this topic at OSSEU[1], but I don't maintain that
> code. Ulf does.
>
> [1] https://osseu2024.sched.com/event/1ej38/the-case-for-an-soc-power-management-driver-stephen-boyd-google

Kind regards
Uffe