Message-ID: <34b7ae7d-c4d3-4d94-a1e9-62d3d4fc6b9a@collabora.com>
Date:   Tue, 28 Nov 2023 16:42:25 +0100
From:   AngeloGioacchino Del Regno 
        <angelogioacchino.delregno@...labora.com>
To:     Boris Brezillon <boris.brezillon@...labora.com>
Cc:     robh@...nel.org, steven.price@....com,
        maarten.lankhorst@...ux.intel.com, mripard@...nel.org,
        tzimmermann@...e.de, airlied@...il.com, daniel@...ll.ch,
        dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org,
        kernel@...labora.com, m.szyprowski@...sung.com,
        krzysztof.kozlowski@...aro.org
Subject: Re: [PATCH v2 3/3] drm/panfrost: Synchronize and disable interrupts
 before powering off

On 28/11/23 16:38, Boris Brezillon wrote:
> On Tue, 28 Nov 2023 16:10:43 +0100
> AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>
> wrote:
> 
>> On 28/11/23 15:06, Boris Brezillon wrote:
>>> On Tue, 28 Nov 2023 13:45:10 +0100
>>> AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>
>>> wrote:
>>>    
>>>> To make sure that we don't unintentionally perform any unclocked and/or
>>>> unpowered R/W operation on GPU registers, before turning off clocks and
>>>> regulators we must make sure that no GPU, JOB or MMU ISR execution is
>>>> pending: doing that requires adding a mechanism to synchronize the
>>>> interrupts on suspend.
>>>>
>>>> Add functions panfrost_{gpu,job,mmu}_suspend_irq() which will perform
>>>> interrupt masking and ISR execution synchronization, and then call
>>>> those in the panfrost_device_runtime_suspend() handler in the exact
>>>> sequence of job (may require mmu!) -> mmu -> gpu.
>>>>
>>>> As a side note, JOB and MMU suspend_irq functions needed some special
>>>> treatment: as their interrupt handlers will unmask interrupts, it was
>>>> necessary to add a bitmap for "is_suspending" which is used to address
>>>> the possible corner case of unintentional IRQ unmasking because of ISR
>>>> execution after a call to synchronize_irq().
>>>>
>>>> Of course, unmasking the interrupts is being done as part of the reset
>>>> happening during runtime_resume(): since we're anyway resuming all of
>>>> GPU, JOB, MMU, the only additional action is to zero out the newly
>>>> introduced `is_suspending` bitmap directly in the resume handler, so
>>>> as to avoid adding panfrost_{job,mmu}_resume_irq() functions just for
>>>> clearing their own bits, especially because it currently makes way
>>>> more sense to just zero out the bitmap.
>>>>
>>>> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>
>>>> ---
>>>>    drivers/gpu/drm/panfrost/panfrost_device.c |  4 ++++
>>>>    drivers/gpu/drm/panfrost/panfrost_device.h |  7 +++++++
>>>>    drivers/gpu/drm/panfrost/panfrost_gpu.c    |  7 +++++++
>>>>    drivers/gpu/drm/panfrost/panfrost_gpu.h    |  1 +
>>>>    drivers/gpu/drm/panfrost/panfrost_job.c    | 18 +++++++++++++++---
>>>>    drivers/gpu/drm/panfrost/panfrost_job.h    |  1 +
>>>>    drivers/gpu/drm/panfrost/panfrost_mmu.c    | 17 ++++++++++++++---
>>>>    drivers/gpu/drm/panfrost/panfrost_mmu.h    |  1 +
>>>>    8 files changed, 50 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c b/drivers/gpu/drm/panfrost/panfrost_device.c
>>>> index c90ad5ee34e7..ed34aa55a7da 100644
>>>> --- a/drivers/gpu/drm/panfrost/panfrost_device.c
>>>> +++ b/drivers/gpu/drm/panfrost/panfrost_device.c
>>>> @@ -407,6 +407,7 @@ static int panfrost_device_runtime_resume(struct device *dev)
>>>>    {
>>>>    	struct panfrost_device *pfdev = dev_get_drvdata(dev);
>>>>    
>>>> +	bitmap_zero(pfdev->is_suspending, PANFROST_COMP_BIT_MAX);
>>>
>>> I would let each sub-block clear their bit in the reset path, since
>>> that's where the IRQs are effectively unmasked.
>>>    
>>
>>
>> Honestly I wouldn't like seeing that: the reason is that this is something that
>> is done *for* suspend/resume and only for that, while reset may be called out of
>> the suspend/resume handlers.
>>
>> I find clearing the suspend bits in the HW reset path a bit confusing, especially
>> when it is possible to avoid doing it there...
> 
> Well, I do think it's preferable to keep the irq_is_no_longer_suspended
> state update where the interrupt is effectively unmasked. Note that
> when you do a reset, the IRQ is silently suspended just after the
> reset happens, because the xxx_INT_MASKs are restored to their default
> value, so I do consider that clearing this bit in the reset path makes
> sense.
> 

Okay then, I can move it, no problem.
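
To illustrate what moving it would look like, here is a minimal userspace
sketch, not the actual driver change. Everything prefixed with model_ or
MODEL_ is hypothetical: a plain unsigned long stands in for the
DECLARE_BITMAP() field, the *_int_mask members stand in for the xxx_INT_MASK
registers, and the suspend helpers only note where synchronize_irq() would go.
Each sub-block clears its own bit at the point where its interrupts are
unmasked again:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; names only mirror the patch. */
enum model_comp_bits {
	MODEL_COMP_BIT_MMU,
	MODEL_COMP_BIT_JOB,
	MODEL_COMP_BIT_MAX
};

struct model_dev {
	unsigned long is_suspended;  /* stands in for DECLARE_BITMAP() */
	unsigned int job_int_mask;   /* stands in for JOB_INT_MASK */
	unsigned int mmu_int_mask;   /* stands in for MMU_INT_MASK */
};

static bool model_test_bit(const struct model_dev *d, int bit)
{
	assert(bit >= 0 && bit < MODEL_COMP_BIT_MAX);
	return (d->is_suspended >> bit) & 1UL;
}

/* Suspend side: flag the sub-block first, then mask its interrupts. */
static void model_job_suspend_irq(struct model_dev *d)
{
	d->is_suspended |= 1UL << MODEL_COMP_BIT_JOB;
	d->job_int_mask = 0;
	/* The real driver would call synchronize_irq() at this point. */
}

static void model_mmu_suspend_irq(struct model_dev *d)
{
	d->is_suspended |= 1UL << MODEL_COMP_BIT_MMU;
	d->mmu_int_mask = 0;
	/* The real driver would call synchronize_irq() at this point. */
}

/* Reset side: each sub-block clears its own bit where it unmasks. */
static void model_job_reset(struct model_dev *d)
{
	d->is_suspended &= ~(1UL << MODEL_COMP_BIT_JOB);
	d->job_int_mask = ~0u;
}

static void model_mmu_reset(struct model_dev *d)
{
	d->is_suspended &= ~(1UL << MODEL_COMP_BIT_MMU);
	d->mmu_int_mask = ~0u;
}

/* Used by resume and by any other reset; no bitmap_zero() needed here. */
static void model_device_reset(struct model_dev *d)
{
	model_job_reset(d);
	model_mmu_reset(d);
}
```

With this layout the resume handler only has to call the reset, and a reset
triggered outside of suspend/resume leaves the bits cleared as well.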

>>
>>>>    	panfrost_device_reset(pfdev);
>>>>    	panfrost_devfreq_resume(pfdev);
>>>>    
>>>> @@ -421,6 +422,9 @@ static int panfrost_device_runtime_suspend(struct device *dev)
>>>>    		return -EBUSY;
>>>>    
>>>>    	panfrost_devfreq_suspend(pfdev);
>>>> +	panfrost_job_suspend_irq(pfdev);
>>>> +	panfrost_mmu_suspend_irq(pfdev);
>>>> +	panfrost_gpu_suspend_irq(pfdev);
>>>>    	panfrost_gpu_power_off(pfdev);
>>>>    
>>>>    	return 0;
>>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
>>>> index 54a8aad54259..29f89f2d3679 100644
>>>> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
>>>> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
>>>> @@ -25,6 +25,12 @@ struct panfrost_perfcnt;
>>>>    #define NUM_JOB_SLOTS 3
>>>>    #define MAX_PM_DOMAINS 5
>>>>    
>>>> +enum panfrost_drv_comp_bits {
>>>> +	PANFROST_COMP_BIT_MMU,
>>>> +	PANFROST_COMP_BIT_JOB,
>>>> +	PANFROST_COMP_BIT_MAX
>>>> +};
>>>> +
>>>>    /**
>>>>     * enum panfrost_gpu_pm - Supported kernel power management features
>>>>     * @GPU_PM_CLK_DIS:  Allow disabling clocks during system suspend
>>>> @@ -109,6 +115,7 @@ struct panfrost_device {
>>>>    
>>>>    	struct panfrost_features features;
>>>>    	const struct panfrost_compatible *comp;
>>>> +	DECLARE_BITMAP(is_suspending, PANFROST_COMP_BIT_MAX);
>>>
>>> nit: Maybe s/is_suspending/suspended_irqs/, given the state remains
>>> until the device is resumed.
>>
>> If we keep the `is_suspending` name, we can use this one more generically in
>> case we ever need to, what do you think?
> 
> I'm lost. Why would we want to reserve a name for something we don't
> know about? My comment was mostly relating to the fact this bitmap
> doesn't reflect the is_suspending state, but rather is_suspended,
> because it remains set until the device is resumed. And we actually want
> it to reflect the is_suspended state, so we can catch interrupts that
> are not for us without reading regs in the hard irq handler, when the
> GPU is suspended.

`is_suspended` (fun story: that's the first name I gave it) looks good to me,
the doubt I raised was about calling it `suspended_irqs` instead, as I would
prefer to keep names "more generic", but that's just personal preference at
this point anyway.
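
Whatever the name ends up being, the two uses discussed above can be sketched
in userspace as follows. This is an assumption-laden model, not driver code:
all sketch_/SKETCH_ names are hypothetical, reg_reads merely counts simulated
register accesses, and the split into a hard handler plus a threaded handler
is a simplification of the JOB/MMU interrupt paths. The hard handler bails out
before touching any register while the bit is set, and the threaded part does
not unmask when a suspend happened in between:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the hard/threaded handler pair. */
enum sketch_irq_ret { SKETCH_IRQ_NONE, SKETCH_IRQ_WAKE_THREAD };

enum { SKETCH_BIT_MMU, SKETCH_BIT_JOB, SKETCH_BIT_MAX };

struct sketch_dev {
	unsigned long is_suspended;  /* one bit per sub-block */
	unsigned int job_int_stat;   /* simulated status register */
	unsigned int job_int_mask;   /* simulated mask register */
	unsigned int reg_reads;      /* number of simulated register reads */
};

static bool sketch_test_bit(const struct sketch_dev *d, int bit)
{
	assert(bit >= 0 && bit < SKETCH_BIT_MAX);
	return (d->is_suspended >> bit) & 1UL;
}

static unsigned int sketch_read_stat(struct sketch_dev *d)
{
	d->reg_reads++;
	return d->job_int_stat;
}

/* Hard handler: never read registers while the block is suspended. */
static enum sketch_irq_ret sketch_job_hardirq(struct sketch_dev *d)
{
	if (sketch_test_bit(d, SKETCH_BIT_JOB))
		return SKETCH_IRQ_NONE;

	if (!sketch_read_stat(d))
		return SKETCH_IRQ_NONE;

	d->job_int_mask = 0;  /* mask until the threaded part has run */
	return SKETCH_IRQ_WAKE_THREAD;
}

/* Threaded part: unmask only if no suspend happened in the meantime. */
static void sketch_job_thread(struct sketch_dev *d)
{
	d->job_int_stat = 0;  /* pretend the events were handled */

	if (!sketch_test_bit(d, SKETCH_BIT_JOB))
		d->job_int_mask = ~0u;
}
```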

Cheers,
Angelo
