[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <360b8654-01be-4f47-90eb-4fdb2055c653@arm.com>
Date: Fri, 24 Oct 2025 14:02:52 +0100
From: Steven Price <steven.price@....com>
To: Karunika Choo <karunika.choo@....com>, dri-devel@...ts.freedesktop.org
Cc: nd@....com, Boris Brezillon <boris.brezillon@...labora.com>,
Liviu Dudau <liviu.dudau@....com>,
Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
Maxime Ripard <mripard@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>,
David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1 06/10] drm/panthor: Implement L2 power on/off via
PWR_CONTROL
On 24/10/2025 12:51, Karunika Choo wrote:
> On 24/10/2025 10:43, Steven Price wrote:
>> On 23/10/2025 23:16, Karunika Choo wrote:
>>> On 20/10/2025 11:50, Steven Price wrote:
>>>> On 14/10/2025 10:43, Karunika Choo wrote:
>>>>> This patch adds common helpers to issue power commands, poll
>>>>> transitions, and validate domain state, then wires them into the L2
>>>>> on/off paths.
>>>>>
>>>>> The L2 power-on sequence now delegates control of the SHADER and TILER
>>>>> domains to the MCU when allowed, while the L2 itself is never delegated.
>>>>> On power-off, dependent domains beneath the L2 are checked, and if
>>>>> necessary, retracted and powered down to maintain proper domain
>>>>> ordering.
>>>>>
>>>>> Signed-off-by: Karunika Choo <karunika.choo@....com>
>>>>> ---
>> [...]
>>>>> + u64 domain_ready = gpu_read64(ptdev, get_domain_ready_reg(child_domain));
>>>>> +
>>>>> + if (domain_ready && (pwr_status & PWR_STATUS_DOMAIN_DELEGATED(child_domain))) {
>>>>> + drm_warn(&ptdev->base,
>>>>> + "L2 power off: Delegated %s domain not powered down by MCU",
>>>>> + get_domain_name(child_domain));
>>>>> + ret = retract_domain(ptdev, child_domain);
>>>>> + if (ret) {
>>>>> + drm_err(&ptdev->base, "Failed to retract %s domain",
>>>>> + get_domain_name(child_domain));
>>>>> + panthor_pwr_info_show(ptdev);
>>>>> + return ret;
>>>>> + }
>>>>> + }
>>>>> +
>>>>> + ret = panthor_pwr_domain_power_off(ptdev, child_domain, domain_ready,
>>>>> + PWR_TRANSITION_TIMEOUT_US);
>>>>> + if (ret)
>>>>> + return ret;
>>>>> + }
>>>>> +
>>>>> + return panthor_pwr_domain_power_off(ptdev, PWR_COMMAND_DOMAIN_L2,
>>>>> + ptdev->gpu_info.l2_present,
>>>>> + PWR_TRANSITION_TIMEOUT_US);
>>>>
>>>> Does this implicitly 'retract' the shader/tiler power domains? If so I
>>>> think it's worth a comment. Otherwise it looks like we don't actually
>>>> know the status of whether the shader/tiler power domains are retracted
>>>> or not.
>>>>
>>>
>>> panthor_pwr_l2_power_off() will only retract the shader/tiler domains if
>>> they have not been powered down by the MCU. In cases where the MCU did
>>> power down these child domains, delegate_domain() will exit early as
>>> they would already be delegated. I understand the ambiguity here,
>>> hopefully it is somewhat acceptable.
>>
>> So my question was really how does the driver know whether the domains
>> are delegated or not when this function returns?
>>
>> I couldn't quite get my head around whether turning the L2 power domain
>> off would implicitly 'retract' the shader/tiler power domains -
>> obviously it forces them off which means the MCU doesn't have control.
>> So retracting would make sense, but I couldn't see anything in the spec.
>>
>> It would be good to have a comment explaining what the expected state is
>> after this function (panthor_pwr_l2_power_off) returns. Is it unknown
>> whether the shader/tiler are retracted, or is there something in the
>> hardware which does this automatically so we know but don't have to
>> manually retract? Presumably if we end up fully powering down the GPU
>> that must effectively retract all domains (the GPU hardware is reset so
>> it goes back to reset conditions).
>>
>> Sorry, it's a bit of a basic question but the spec is somewhat unhelpful
>> on this point! (Or at least I haven't found a relevant statement).
>>
>
> Powering off the L2 does not automatically retract its child domains.
> The above case is for handling the edge case where the MCU is hung and
> is not able to power off the delegated domains, therefore the host needs
> to take over and power them down before turning off the L2. Additionally,
> like you have alluded to, powering off the GPU will inevitably reset all
> of these states (retracting the child domains), necessitating a
> re-delegation on L2 power on.
>
> Therefore, the typical operation loop will be as follows:
> 1. L2 power on
> 2. Delegate Tiler/Shader
> <suspend>
> 3. Halt MCU (should power down Tiler/Shader)
> 4. L2 power off (no retract of Tiler/Shader)
> <resume>
> 5. L2 power on (next resume)
> 6. Delegate Tiler/Shader (skipped as already delegated)
>
> If the MCU is hung:
> 1. L2 power on
> 2. Delegate Tiler/Shader
> <suspend>
> 3. Halt MCU fails
> 4. L2 power off (Retract and power off Shader/Tiler)
> <resume>
> 5. L2 power on
> 6. Delegate Tiler/Shader
>
> If the GPU is turned off between suspend and resume:
> 1. L2 power on
> 2. Delegate Tiler/Shader
> <suspend>
> 3. Halt MCU (should power down Tiler/Shader)
> 4. L2 power off (no retract of Tiler/Shader)
> <GPU turned off>
> <resume>
> 6. L2 power on
> 7. Delegate Tiler/Shader
Thanks for the explanation!
>
> With the current implementation, we cannot expect it to be always
> retracted on return of the function, but it does provide the
> additional benefit that on resume we don't need to go through the
> whole delegate cycle after powering up the L2, allowing us to
> save some time there.
>
> On the other hand, if we want to explicitly enforce that we retract on
> suspend, then we have to accept the additional cost to delegate the
> domains on resume.
No, there's no need to change it. But I think it's worth a comment that
in the usual case (the MCU isn't hung) the shader/tiler are left
delegated and the attempt to delegate them again will detect this and
skip it.
I think what mostly confused me is that delegate_domain() has the following:
> + if (pwr_status_reg & delegated_mask) {
> + drm_dbg(&ptdev->base, "%s domain already delegated",
> + get_domain_name(domain));
> + return 0;
> + }
Although it's "drm_dbg" that message makes it seem like this is an
unexpected situation. Whereas actually we would normally expect that to
happen during a resume (as long as the GPU remains powered). With that
in my head I then started to think that there might be something in the
hardware causing an automatic "retract".
Thanks,
Steve
Powered by blists - more mailing lists