Message-ID: <927c9a00-f825-439b-bc78-03211fed4da9@arm.com>
Date: Thu, 31 Jul 2025 14:34:32 +0100
From: Steven Price <steven.price@....com>
To: Karunika Choo <karunika.choo@....com>, dri-devel@...ts.freedesktop.org
Cc: nd@....com, Boris Brezillon <boris.brezillon@...labora.com>,
Liviu Dudau <liviu.dudau@....com>,
Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
Maxime Ripard <mripard@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>,
David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
linux-kernel@...r.kernel.org, Dennis Tsiang <dennis.tsiang@....com>
Subject: Re: [PATCH v1] drm/panthor: Serialize GPU cache flush operations

On 31/07/2025 13:48, Karunika Choo wrote:
> On 31/07/2025 11:57, Steven Price wrote:
>> On 30/07/2025 18:43, Karunika Choo wrote:
>>> In certain scenarios, it is possible for multiple cache flushes to be
>>> requested before the previous one completes. This patch introduces the
>>> cache_flush_lock mutex to serialize these operations and ensure that
>>> any requested cache flushes are completed instead of dropped.
>>>
>>> Signed-off-by: Karunika Choo <karunika.choo@....com>
>>> Co-developed-by: Dennis Tsiang <dennis.tsiang@....com>
>>
>> A Co-developed-by: tag needs a matching Signed-off-by: too [1].
>
> Oops. I can push a v2 to add those.
>
>>
>> [1]
>> https://www.kernel.org/doc/html/latest/process/submitting-patches.html#when-to-use-acked-by-cc-and-co-developed-by
>>
>> But I also don't understand how this is happening. The only caller to
>> panthor_gpu_flush_caches() is in panthor_sched_suspend() and that is
>> holding the sched->lock mutex.
>
> The fix is in relation to the enablement of GPU Flush caches by default
> for all GPUs [1]. While calls from the MMU are serialized, other calls
> i.e. from panthor_sched_suspend() are not. As such, this patch
> explicitly serializes these operations.

Ah, ok so this is effectively a bug fix for that patch - given we've not
yet merged that series can we just do a v9 of the series with the fix
rolled in? (Rather than having a commit or two where we know the bug is
present).

I have to admit it also feels like we should have something to avoid
doing excessive cache flushes - there's no point in queuing up multiple
flushes back-to-back. But I don't have a neat solution, and I'm not sure
whether this will happen often enough to worry about. So I guess we
should probably ignore it until/unless it becomes a problem.

Steve

> [1]
> https://lore.kernel.org/all/20250724124210.3675094-6-karunika.choo@arm.com/
>
> Kind regards,
> Karunika Choo
>
>> Steve
>>
>>> ---
>>> drivers/gpu/drm/panthor/panthor_gpu.c | 7 +++++++
>>> 1 file changed, 7 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
>>> index cb7a335e07d7..030409371037 100644
>>> --- a/drivers/gpu/drm/panthor/panthor_gpu.c
>>> +++ b/drivers/gpu/drm/panthor/panthor_gpu.c
>>> @@ -35,6 +35,9 @@ struct panthor_gpu {
>>>
>>>  	/** @reqs_acked: GPU request wait queue. */
>>>  	wait_queue_head_t reqs_acked;
>>> +
>>> +	/** @cache_flush_lock: Lock to serialize cache flushes */
>>> +	struct mutex cache_flush_lock;
>>>  };
>>>
>>> /**
>>> @@ -204,6 +207,7 @@ int panthor_gpu_init(struct panthor_device *ptdev)
>>>
>>>  	spin_lock_init(&gpu->reqs_lock);
>>>  	init_waitqueue_head(&gpu->reqs_acked);
>>> +	mutex_init(&gpu->cache_flush_lock);
>>>  	ptdev->gpu = gpu;
>>>  	panthor_gpu_init_info(ptdev);
>>>
>>> @@ -353,6 +357,9 @@ int panthor_gpu_flush_caches(struct panthor_device *ptdev,
>>>  	bool timedout = false;
>>>  	unsigned long flags;
>>>
>>> +	/* Serialize cache flush operations. */
>>> +	guard(mutex)(&ptdev->gpu->cache_flush_lock);
>>> +
>>>  	spin_lock_irqsave(&ptdev->gpu->reqs_lock, flags);
>>>  	if (!drm_WARN_ON(&ptdev->base,
>>>  			 ptdev->gpu->pending_reqs & GPU_IRQ_CLEAN_CACHES_COMPLETED)) {
>>
>
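
For illustration only, the locking pattern in the quoted patch can be
sketched in userspace roughly as below. This is not the panthor code:
struct fake_gpu, its fields, and every fake_gpu_*() / run_serialized_demo()
name are hypothetical stand-ins, pthread mutexes replace both the kernel
mutex and the reqs_lock spinlock, and the "hardware" completes a flush
inline after a yield. The only point it tries to show is that taking an
outer lock around the whole request means a caller never finds a flush
still pending, so no request is dropped.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

#define MAX_THREADS 16

/* Hypothetical stand-in for the GPU state; not the panthor structures. */
struct fake_gpu {
	pthread_mutex_t cache_flush_lock; /* serializes whole flush operations */
	pthread_mutex_t reqs_lock;        /* protects the fields below */
	bool flush_pending;               /* a flush is currently in flight */
	unsigned int issued;              /* flushes actually sent to "hardware" */
	unsigned int dropped;             /* requests skipped because one was pending */
};

static void fake_gpu_init(struct fake_gpu *gpu)
{
	pthread_mutex_init(&gpu->cache_flush_lock, NULL);
	pthread_mutex_init(&gpu->reqs_lock, NULL);
	gpu->flush_pending = false;
	gpu->issued = 0;
	gpu->dropped = 0;
}

/*
 * Issue one flush unless one is already pending. Without outer
 * serialization, a second concurrent caller would take the "dropped" path.
 */
static bool fake_gpu_flush_once(struct fake_gpu *gpu)
{
	bool issued = false;

	pthread_mutex_lock(&gpu->reqs_lock);
	if (!gpu->flush_pending) {
		gpu->flush_pending = true;
		gpu->issued++;
		issued = true;
	} else {
		gpu->dropped++;
	}
	pthread_mutex_unlock(&gpu->reqs_lock);

	if (!issued)
		return false;

	sched_yield(); /* pretend the hardware takes a while */

	pthread_mutex_lock(&gpu->reqs_lock);
	gpu->flush_pending = false; /* simulated "flush completed" event */
	pthread_mutex_unlock(&gpu->reqs_lock);
	return true;
}

/* Same request, but held under the outer lock from issue to completion. */
static bool fake_gpu_flush_serialized(struct fake_gpu *gpu)
{
	bool ok;

	pthread_mutex_lock(&gpu->cache_flush_lock);
	ok = fake_gpu_flush_once(gpu);
	pthread_mutex_unlock(&gpu->cache_flush_lock);
	return ok;
}

static void *flush_thread(void *arg)
{
	fake_gpu_flush_serialized(arg);
	return NULL;
}

/* Start nthreads concurrent flush requests and wait for all of them. */
static void run_serialized_demo(struct fake_gpu *gpu, unsigned int nthreads)
{
	pthread_t threads[MAX_THREADS];
	unsigned int i;

	assert(nthreads <= MAX_THREADS);
	fake_gpu_init(gpu);

	for (i = 0; i < nthreads; i++) {
		int err = pthread_create(&threads[i], NULL, flush_thread, gpu);

		assert(err == 0);
		(void)err;
	}
	for (i = 0; i < nthreads; i++)
		pthread_join(threads[i], NULL);
}
```

Calling run_serialized_demo(&gpu, 8) from a main() and then reading
gpu.issued and gpu.dropped should show every request issued and none
dropped. The kernel patch uses guard(mutex), which releases the lock
automatically at scope exit; the explicit lock/unlock pair here is just
the closest plain-C equivalent.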