linux-kernel - Re: [PATCH] drm/panfrost: Ignore core_mask for poweroff and sync interrupts

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a3ae6eb5-902b-4e33-9b14-a9dc3cc5056d@collabora.com>
Date:   Thu, 23 Nov 2023 12:43:31 +0100
From:   AngeloGioacchino Del Regno 
        <angelogioacchino.delregno@...labora.com>
To:     Boris Brezillon <boris.brezillon@...labora.com>
Cc:     steven.price@....com, robh@...nel.org,
        maarten.lankhorst@...ux.intel.com, mripard@...nel.org,
        tzimmermann@...e.de, airlied@...il.com, daniel@...ll.ch,
        dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org,
        krzysztof.kozlowski@...aro.org, kernel@...labora.com
Subject: Re: [PATCH] drm/panfrost: Ignore core_mask for poweroff and sync
 interrupts

Il 23/11/23 12:15, AngeloGioacchino Del Regno ha scritto:
> Il 23/11/23 11:35, Boris Brezillon ha scritto:
>> On Thu, 23 Nov 2023 10:53:20 +0100
>> AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>
>> wrote:
>>
>>> Some SoCs may be equipped with a GPU containing two core groups
>>> and this is exactly the case of Samsung's Exynos 5422 featuring
>>> an ARM Mali-T628 MP6 GPU: the support for this GPU in Panfrost
>>> is partial, as this driver currently supports using only one
>>> core group and that's reflected on all parts of it, including
>>> the power on (and power off, previously to this patch) function.
>>>
>>> The issue with this is that even though executing the soft reset
>>> operation should power off all cores unconditionally, on at least
>>> one platform we're seeing a crash that seems to be happening due
>>> to an interrupt firing which may be because we are calling power
>>> transition only on the first core group, leaving the second one
>>> unchanged, or because ISR execution was pending before entering
>>> the panfrost_gpu_power_off() function and executed after powering
>>> off the GPU cores, or all of the above.
>>>
>>> Finally, solve this by changing the power off flow to
>>>   1. Mask and clear all interrupts: we don't need nor want any, as
>>>      we are polling PWRTRANS anyway;
>>>   2. Call synchronize_irq() after that to make sure that any pending
>>>      ISR is executed before powering off the GPU Shaders/Tilers/L2
>>>      hence avoiding unpowered registers R/W; and
>>>   3. Ignore the core_mask and ask the GPU to poweroff both core groups
>>
>> Could we split that in two patches? 1+2 in one patch, and 3 in another.
>> These are two orthogonal fixes IMO.
>>
> 
> My initial idea was exactly that, but I opted for one patch doing 'em all
> because a "full fix" comprises all of 1+2+3: the third one without the
> first two and vice-versa may not fully resolve the issue that was seen
> on the HC1 board.
> 
> So, while I agree that it'd be slightly more readable as a diff if those
> were two different commits I do have reasons against splitting.....
> 
>>>
>>> Of course it was also necessary to add a `irq` variable to `struct
>>> panfrost_device` as we need to get that in panfrost_gpu_power_off()
>>> for calling synchronize_irq() on it.
>>>
>>> Fixes: 123b431f8a5c ("drm/panfrost: Really power off GPU cores in 
>>> panfrost_gpu_power_off()")
>>> [Regression detected on Odroid HC1, Exynos 5422, Mali-T628 MP6]
>>> Reported-by: Krzysztof Kozlowski <krzysztof.kozlowski@...aro.org>
>>> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>
>>> ---
>>>   drivers/gpu/drm/panfrost/panfrost_device.h |  1 +
>>>   drivers/gpu/drm/panfrost/panfrost_gpu.c    | 26 +++++++++++++++-------
>>>   2 files changed, 19 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h 
>>> b/drivers/gpu/drm/panfrost/panfrost_device.h
>>> index 0fc558db6bfd..b4feaa99e34f 100644
>>> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
>>> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
>>> @@ -94,6 +94,7 @@ struct panfrost_device {
>>>       struct device *dev;
>>>       struct drm_device *ddev;
>>>       struct platform_device *pdev;
>>> +    int irq;
>>
>> I know it's the only irq being stored at the panfrost_device level, but
>> I think it's clearer if we explicitly prefix it with gpu_.
>>
> 
> Makes sense, agreed.
> 
>>>       void __iomem *iomem;
>>>       struct clk *clock;
>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c 
>>> b/drivers/gpu/drm/panfrost/panfrost_gpu.c
>>> index 1cc55fb9c45b..30b395125155 100644
>>> --- a/drivers/gpu/drm/panfrost/panfrost_gpu.c
>>> +++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c
>>> @@ -425,11 +425,21 @@ void panfrost_gpu_power_on(struct panfrost_device *pfdev)
>>>   void panfrost_gpu_power_off(struct panfrost_device *pfdev)
>>>   {
>>> -    u64 core_mask = panfrost_get_core_mask(pfdev);
>>>       int ret;
>>>       u32 val;
>>> -    gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present & 
>>> core_mask);
>>> +    /* We are polling PWRTRANS and we don't need nor want interrupts */
>>
>> I kinda agree with that, but that's not exactly why we're
>> masking+syncing IRQs here. If that was the only reason, the right fix
>> would be to modify GPU_IRQ_MASK_ALL so it doesn't include the PWRTRANS
>> bits.
>>
>> This fix should cover more than just the power transition IRQ use case:
>> we want all IRQs to be disabled before the device is suspended.
>>
> 
> Eh I should reword that, effectively, because what I wrote as comments make
> sense (as in, the logic of it completes) only if you read both of them "as one".
> 
> I'll do that in the new suspend irq helper :-)
> 
>>> +    gpu_write(pfdev, GPU_INT_MASK, 0);
>>> +    gpu_write(pfdev, GPU_INT_CLEAR, GPU_IRQ_MASK_ALL);
>>> +
>>> +    /*
>>> +     * Make sure that we don't have pending ISRs, otherwise we'll be
>>> +     * reading and/or writing registers while the GPU is powered off
>>> +     */
>>> +    synchronize_irq(pfdev->irq);
>>
>> Could we move that to a panfrost_gpu_suspend_irq() helper? I'm also not
>> sure making it part of panfrost_gpu_power_off() is a good idea. I'd
>> rather have this panfrost_gpu_suspend_irq() helper called from
>> panfrost_device_[runtime_]suspend(), along with
>> panfrost_{mmu,job}_suspend_irq().
>>
> 
> Okay I will move that to a helper, but I still want to clarify:
>   - For JOB, we're checking if panfrost_job_is_idle() before trying
>     to runtime_suspend() (hence before trying to power off cores),
>     so implicitly no interrupt can fire I guess? Though there could
>     still be a pending ISR there too.... mmh. Brain ticking :-)
>   - For MMU, we're not checking anything, but I guess that if there
>     is no job, the mmu can't be doing anything at all?
>     ...but then you also gave me the doubt about that one as well.
> 
> What I think that would be sensible to do is to get this commit as
> a "clear" fix for the "Really power off" one, then have one or more
> additional commit(s) without any fixes tag to improve the IRQ suspend
> with the new mmu/job irq suspend helpers.
> 
> Of course *this* commit would introduce the panfrost_gpu_suspend_irq()
> helper directly, instead of moving the logic to a helper in a later one.
> 
> Any reason against? :-)
> 
>>> +
>>> +    /* Now it's safe to request poweroff for Shaders, Tilers and L2 */
>>
>> It was safe before too, it's just that we probably don't want to be
> 
> In theory yes, in practice the Odroid HC1 board crashed :-P
> 
> P.S.: Joking! I understand what you're saying :-)
> 
>> interrupted, if all we do is ignore the interrupts we receive, hence
>> the suggestion to not use GPU_IRQ_MASK_ALL, and only enable the
>> IRQs we care about instead.
>>
>>> +    gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present);
>>>       ret = readl_relaxed_poll_timeout(pfdev->iomem + SHADER_PWRTRANS_LO,
>>>                        val, !val, 1, 1000);
>>>       if (ret)
>>> @@ -441,7 +451,7 @@ void panfrost_gpu_power_off(struct panfrost_device *pfdev)
>>>       if (ret)
>>>           dev_err(pfdev->dev, "tiler power transition timeout");
>>> -    gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present & core_mask);
>>> +    gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present);
>>
>> I really think we should have a helper doing the 'write(PWROFF_{LO,HI} +
>> poll(PWRTRANS_{LO,HI})' sequence, similar to what's done here [1][2],
>> such that, once we got it right for one block, we have it working for
>> all of them. And if there's a fix to apply, it automatically applies
>> to all blocks without having to fix the same bug in each copy.
>>
> 
> ...technically yes, but practically this driver doesn't currently support any
> GPU that even fills the _LO registers.
> 
> I guess that we can do that later?
> 
> That's just to (paranoidly) avoid any risk to introduce other regressions in
> this merge window, since we're fixing one that shouldn't have happened in the
> first place...
> 
>> Note that these panthor_gpu_block_power_{on,off}() helpers also handle
>> the case where power transitions are already in progress when you ask a
>> new power transition, which I don't think is checked in
>> panfrost_gpu_power_{off,on}().
>>
> 
> I admit I didn't analyze the panthor driver - but here, the only power transitions
> that may happen are either because of panfrost_gpu_power_on(), or because of
> panfrost_gpu_power_off(), and both are polling and blocking... so from what I
> understand, there's no possibility to have "another" power transition happening
> while calling poweron, or poweroff.
> 
> That would change if we start to selectively turn on and off a number of shaders
> and/or a number of tilers (not all of them) depending on the workload, but we're
> not doing that...
> 
> ...yet?
> 
> :-)
> 
>>>       ret = readl_poll_timeout(pfdev->iomem + L2_PWRTRANS_LO,
>>>                        val, !val, 0, 1000);
>>
>> Not introduced by the patch, but I noticed args passed on the second
>> line are no longer aligned on the open parens.
>>
> 
> Yeah, fixing that for v2 :-)
> 

Sorry for the double reply - I just noticed that you're seeing this misalignment
because I had a *local* commit introducing that but, on linux-next, this is not
present, so there's no misalignment to fix........ :-)

Cheers,
Angelo