Message-ID: <21983f8d-681d-4fed-ae44-42eee44c7f14@samsung.com>
Date: Mon, 7 Apr 2025 19:02:12 +0200
From: Michal Wilczynski <m.wilczynski@...sung.com>
To: Ulf Hansson <ulf.hansson@...aro.org>, Stephen Boyd <sboyd@...nel.org>
Cc: Philipp Zabel <p.zabel@...gutronix.de>, alex@...ti.fr,
aou@...s.berkeley.edu, conor+dt@...nel.org, drew@...7.com,
guoren@...nel.org, jszhang@...nel.org, krzk+dt@...nel.org,
m.szyprowski@...sung.com, mturquette@...libre.com, palmer@...belt.com,
paul.walmsley@...ive.com, robh@...nel.org, wefu@...hat.com,
linux-clk@...r.kernel.org, devicetree@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-riscv@...ts.infradead.org,
linux-pm@...r.kernel.org
Subject: Re: [PATCH v1 4/4] clk: thead: Add GPU clock gate control with
CLKGEN reset support
On 4/1/25 16:38, Michal Wilczynski wrote:
>
>
> On 3/26/25 12:24, Ulf Hansson wrote:
>> On Tue, 25 Mar 2025 at 23:40, Stephen Boyd <sboyd@...nel.org> wrote:
>>>
>>> Quoting Michal Wilczynski (2025-03-19 02:22:11)
>>>>
>>>>
>>>> On 3/13/25 10:25, Philipp Zabel wrote:
>>>>> On Do, 2025-03-06 at 17:43 +0100, Michal Wilczynski wrote:
>>>>>>
>>>>>> On 3/6/25 00:47, Stephen Boyd wrote:
>>>>>>> Quoting Michal Wilczynski (2025-03-03 06:36:29)
>>>>>>>> The T-HEAD TH1520 has three GPU clocks: core, cfg, and mem. The mem
>>>>>>>> clock gate is marked as "Reserved" in hardware, while core and cfg are
>>>>>>>> configurable. In order for these clock gates to work properly, the
>>>>>>>> CLKGEN reset must be managed in a specific sequence.
>>>>>>>>
>>>>>>>> Move the CLKGEN reset handling to the clock driver since it's
>>>>>>>> fundamentally a clock-related workaround [1]. This ensures that clk_enable()'d
>>>>>>>> GPU clocks stay physically enabled without external interference from
>>>>>>>> the reset driver. The reset is now deasserted only when both core and
>>>>>>>> cfg clocks are enabled, and asserted when either of them is disabled.
>>>>>>>>
>>>>>>>> The mem clock is configured to use nop operations since it cannot be
>>>>>>>> controlled.
>>>>>>>>
>>>>>>>> Link: https://lore.kernel.org/all/945fb7e913a9c3dcb40697328b7e9842b75fea5c.camel@pengutronix.de [1]
>>>>>>>>
>>>>>>>> Signed-off-by: Michal Wilczynski <m.wilczynski@...sung.com>
>>>>>>> [...]
>>>>>>>> diff --git a/drivers/clk/thead/clk-th1520-ap.c b/drivers/clk/thead/clk-th1520-ap.c
>>>>>>>> index ea96d007aecd..1dfcde867233 100644
>>>>>>>> --- a/drivers/clk/thead/clk-th1520-ap.c
>>>>>>>> +++ b/drivers/clk/thead/clk-th1520-ap.c
>>>>>>>> @@ -862,17 +863,70 @@ static CCU_GATE(CLK_SRAM1, sram1_clk, "sram1", axi_aclk_pd, 0x20c, BIT(3), 0);
>>>>>>> [...]
>>>>>>>>
>>>>>>>> static CCU_GATE_CLK_OPS(CLK_GPU_MEM, gpu_mem_clk, "gpu-mem-clk",
>>>>>>>> video_pll_clk_pd, 0x0, BIT(2), 0, clk_nop_ops);
>>>>>>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CORE, gpu_core_clk, "gpu-core-clk",
>>>>>>>> + video_pll_clk_pd, 0x0, BIT(3), 0, ccu_gate_gpu_ops);
>>>>>>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CFG_ACLK, gpu_cfg_aclk, "gpu-cfg-aclk",
>>>>>>>> + video_pll_clk_pd, 0x0, BIT(4), 0, ccu_gate_gpu_ops);
>>>>>>>> +
>>>>>>>> +static void ccu_gpu_clk_disable(struct clk_hw *hw)
>>>>>>>> +{
>>>>>>>> + struct ccu_gate *cg = hw_to_ccu_gate(hw);
>>>>>>>> + unsigned long flags;
>>>>>>>> +
>>>>>>>> + spin_lock_irqsave(&gpu_reset_lock, flags);
>>>>>>>> +
>>>>>>>> + ccu_disable_helper(&cg->common, cg->enable);
>>>>>>>> +
>>>>>>>> + if ((cg == &gpu_core_clk &&
>>>>>>>> + !clk_hw_is_enabled(&gpu_cfg_aclk.common.hw)) ||
>>>>>>>> + (cg == &gpu_cfg_aclk &&
>>>>>>>> + !clk_hw_is_enabled(&gpu_core_clk.common.hw)))
>>>>>>>> + reset_control_assert(gpu_reset);
>>>>>>>
>>>>>>> Why can't the clk consumer control the reset itself? Doing this here is
>>>>>>> not ideal because we hold the clk lock when we try to grab the reset
>>>>>>> lock. These are all spinlocks that should be small in lines of code
>>>>>>> where the lock is held, but we're calling into an entire other framework
>>>>>>> under a spinlock. If an (unrelated) reset driver tries to grab the clk
>>>>>>> lock it will deadlock.
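
For clarity, a sketch of the ABBA ordering hazard Stephen describes (this
is my own illustration, not from his mail; both locks are stand-ins for
the real framework internals):

#include <linux/spinlock.h>

static DEFINE_SPINLOCK(clk_lock);	/* stands in for the clk core's enable_lock */
static DEFINE_SPINLOCK(reset_lock);	/* stands in for a reset driver's lock */

static void clk_side(void)		/* clk_disable() -> ccu_gpu_clk_disable() */
{
	spin_lock(&clk_lock);		/* held by the clk core around clk_ops */
	spin_lock(&reset_lock);		/* taken via reset_control_assert() */
	spin_unlock(&reset_lock);
	spin_unlock(&clk_lock);
}

static void reset_side(void)		/* unrelated reset driver touching a clk */
{
	spin_lock(&reset_lock);
	spin_lock(&clk_lock);		/* ABBA: can deadlock against clk_side() */
	spin_unlock(&clk_lock);
	spin_unlock(&reset_lock);
}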
>>>>>>
>>>>>> So in our case the clk consumer is the drm/imagination driver. Here is
>>>>>> the comment from the maintainer for my previous attempt to use a reset
>>>>>> driver to abstract the GPU init sequence [1]:
>>>>>>
>>>>>> "Do you know what this resets? From our side, the GPU only has a single
>>>>>> reset line (which I assume to be GPU_RESET)."
>>>>>>
>>>>>> "I don't love that this procedure appears in the platform reset driver.
>>>>>> I appreciate it may not be clear from the SoC TRM, but this is the
>>>>>> standard reset procedure for all IMG Rogue GPUs. The currently
>>>>>> supported TI SoC handles this in silicon, when power up/down requests
>>>>>> are sent so we never needed to encode it in the driver before.
>>>>>>
>>>>>> Strictly speaking, the 32 cycle delay is required between power and
>>>>>> clocks being enabled and the reset line being deasserted. If nothing
>>>>>> here touches power or clocks (which I don't think it should), the delay
>>>>>> could potentially be lifted to the GPU driver."
>>>>>>
>>>>>> From the drm/imagination maintainers' point of view, their hardware has
>>>>>> only one reset; the extra CLKGEN reset is SoC specific.
>>>>>
>>>>> If I am understanding correctly, the CLKGEN reset doesn't reset
>>>>> anything in the GPU itself, but holds the GPU clock generator block in
>>>>> reset, effectively disabling the three GPU clocks as a workaround for
>>>>> the always-ungated GPU_MEM clock.
>>>>>
>>>>>> Also the reset driver maintainer didn't like my way of abstracting two
>>>>>> resets ("GPU" and the SoC specific "CLKGEN") into one reset.
>>>>>
>>>>> That is one part of it. The other is that (according to my
>>>>> understanding as laid out above), the combined GPU+CLKGEN reset would
>>>>> effectively disable all three GPU clocks for a while, after the GPU
>>>>> driver has already requested them to be enabled.
>>>>
>>>> Thank you for your comments, Philipp; it seems like we're on the same
>>>> page here. I was wondering whether there is anything I can do to move the
>>>> patches forward.
>>>>
>>>> Stephen, if the current patch is a no-go from your perspective, could you
>>>> please advise whether there is a way to solve this in the clock driver
>>>> that would be acceptable to you.
>>>
>>> It looks like the SoC glue makes the interactions between the clk and
>>> reset frameworks complicated because GPU clks don't work if a reset is
>>> asserted. You're trying to find a place to coordinate the clk and reset.
>>> Am I right?
>>>
>>> I'd advise managing the clks and resets in a generic power domain that
>>> is attached to the GPU device. In that power domain, coordinate the clk
>>> and reset sequencing so that the reset is deasserted before the clks are
>>> enabled (or whatever the actual requirement is). If the GPU driver
>>> _must_ have a clk and reset pointer to use, implement one that either
>>> does nothing or flag to the GPU driver that the power domain is managing
>>> all this for it so it should just use runtime PM and system PM hooks to
>>> turn on the clks and take the GPU out of reset.
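
To make sure I understand the suggestion, a minimal sketch of what such a
power domain could look like for the TH1520 GPU; all of the names below
(th1520_gpu_pd and friends) are placeholders I made up, not a final
implementation:

#include <linux/clk.h>
#include <linux/pm_domain.h>
#include <linux/reset.h>

struct th1520_gpu_pd {
	struct generic_pm_domain genpd;
	struct clk *core_clk;
	struct clk *cfg_clk;
	struct reset_control *clkgen_rst;
	struct reset_control *gpu_rst;
};

static int th1520_gpu_pd_power_on(struct generic_pm_domain *domain)
{
	struct th1520_gpu_pd *pd =
		container_of(domain, struct th1520_gpu_pd, genpd);
	int ret;

	/* Enable the gates first, then release CLKGEN so the clocks run. */
	ret = clk_prepare_enable(pd->core_clk);
	if (ret)
		return ret;

	ret = clk_prepare_enable(pd->cfg_clk);
	if (ret)
		goto err_core;

	ret = reset_control_deassert(pd->clkgen_rst);
	if (ret)
		goto err_cfg;

	/* The IMG-mandated delay (32 clock cycles) would go here. */
	return reset_control_deassert(pd->gpu_rst);

err_cfg:
	clk_disable_unprepare(pd->cfg_clk);
err_core:
	clk_disable_unprepare(pd->core_clk);
	return ret;
}

static int th1520_gpu_pd_power_off(struct generic_pm_domain *domain)
{
	struct th1520_gpu_pd *pd =
		container_of(domain, struct th1520_gpu_pd, genpd);

	reset_control_assert(pd->gpu_rst);
	reset_control_assert(pd->clkgen_rst);
	clk_disable_unprepare(pd->cfg_clk);
	clk_disable_unprepare(pd->core_clk);

	return 0;
}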
>>>
>>> From what I can tell, the GPU driver maintainer doesn't want to think
>>> about the wrapper that likely got placed around the hardware block
>>> shipped by IMG. This wrapper is the SoC glue that needs to go into a
>>> generic power domain so that the different PM resources, reset, clk,
>>> etc. can be coordinated based on the GPU device's power state. It's
>>> either that, or go the dwc3 route and have SoC glue platform drivers
>>> that manage this stuff and create a child device to represent the hard
>>> macro shipped by the vendor like Synopsys/Imagination. Doing the parent
>>> device design isn't as flexible as PM domains because you can only have
>>> one parent device and the child device state can be ignored vs. many PM
>>> domains attached in a graph to a device that are more directly
>>> influenced by the device using runtime PM.
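
And the "dwc3 route" alternative, as I read it: an SoC glue driver that
owns the wrapper resources and spawns a child device for the IMG core
(again, every name here is hypothetical):

#include <linux/err.h>
#include <linux/platform_device.h>

static int th1520_gpu_glue_probe(struct platform_device *pdev)
{
	struct platform_device *child;

	/* ... deassert CLKGEN, enable the gates, sequence the wrapper ... */

	child = platform_device_register_data(&pdev->dev, "img-rogue",
					      PLATFORM_DEVID_AUTO, NULL, 0);
	return PTR_ERR_OR_ZERO(child);
}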
>>>
>>> Maybe you'll be heartened to know this problem isn't unique and has
>>> existed for decades :) I don't know what state the graphics driver is in
>>> but they'll likely be interested in solving this problem in a way that
>>> doesn't "pollute" their driver with SoC specific details. It's all a
>>> question of where you put the code. The reset framework wants to focus
>>> on resets, the clk framework wants to focus on clks, and the graphics
>>> driver wants to focus on graphics. BTW, we went through a similar
>>> discussion with regulators and clks years ago and ended up handling that
>>> with OPPs and power domains.
>>
>> Right, power-domain providers are mostly implementing SoC specific code.
>>
>> In some cases, power-domain providers also handle per device SoC
>> specific constraints/sequences, which seems what you are discussing
>> here. For that, genpd has a couple of callbacks that could be
>> interesting to have a look at, such as:
>>
>> genpd->attach|detach_dev() - for probe/remove
>> genpd.dev_ops->start|stop() - for runtime/system PM
>>
>> That said, maybe just using the regular genpd->power_on|off() callback
>> is sufficient here, depending on how you decide to model things.
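
Ulf, for my own understanding, this is roughly where those hooks plug in,
continuing the placeholder names from the sketch above (stub bodies only,
to show the wiring):

#include <linux/device.h>
#include <linux/pm_domain.h>

static int th1520_gpu_pd_start(struct device *dev)
{
	/* runtime/system PM: e.g. deassert the GPU reset after the delay */
	return 0;
}

static int th1520_gpu_pd_stop(struct device *dev)
{
	/* runtime/system PM: e.g. assert the GPU reset */
	return 0;
}

static struct th1520_gpu_pd gpu_pd = {
	.genpd = {
		.name = "gpu",
		.power_on = th1520_gpu_pd_power_on,
		.power_off = th1520_gpu_pd_power_off,
		/* .attach_dev / .detach_dev would cover probe/remove */
		.dev_ops = {
			.start = th1520_gpu_pd_start,
			.stop = th1520_gpu_pd_stop,
		},
	},
};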
>
>
> Thanks, Stephen, Ulf!
>
> So the way forward as I see it:
>
> 1) The reset driver can be merged as-is, if Philipp is fine with this
> code [2].
> 2) I will cook up the update to the thead power-domain driver which will
> handle reset and clock management.
Hi Ulf,
I'm working on the series right now and I wanted to ask how you prefer
versioning to be handled. Would you like me to send the series as a v1,
or treat it as a continuation of this series [1] and send it as a v9?
I'd like to avoid any confusion.
Thanks,
Michał
[1] - https://lore.kernel.org/all/20250311171900.1549916-1-m.wilczynski@samsung.com/
> 3) I think it would be best to convince the drm/imagination maintainers to
> make the clock management in their consumer driver optional. This way, if
> there is an SoC specific sequence, the clocks/resets will be managed by
> the SoC specific generic PM domain driver. Will talk to them (a rough
> consumer-side sketch follows below).
> 4) Will remove the reset management from this series, and re-send.
>
> [2] - https://lore.kernel.org/all/4205b786-fb65-468c-a3d8-bce807dd829a@samsung.com/
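
For point 3, the consumer-side change could be as small as switching to
the optional clk getters, e.g. (the pvr_* name is mine and purely
illustrative, not the actual imagination driver code):

#include <linux/clk.h>
#include <linux/err.h>

static int pvr_enable_core_clk(struct device *dev)
{
	struct clk *core;

	/* Returns NULL (a no-op clock) if the DT node has no "core" clock,
	 * e.g. when an SoC specific PM domain owns the clocks instead. */
	core = devm_clk_get_optional(dev, "core");
	if (IS_ERR(core))
		return PTR_ERR(core);

	/* clk_prepare_enable(NULL) is a no-op, so no NULL check is needed. */
	return clk_prepare_enable(core);
}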
>>
>>>
>>> I believe a PM domain is the right place for this kind of stuff, and I
>>> actually presented on this topic at OSSEU[1], but I don't maintain that
>>> code. Ulf does.
>>>
>>> [1] https://osseu2024.sched.com/event/1ej38/the-case-for-an-soc-power-management-driver-stephen-boyd-google
>
> Thanks! I watched the presentation, very interesting. Hopefully I'll be
> able to attend in person in Amsterdam this year if you're presenting :-)
>
>>
>> Kind regards
>> Uffe
>>