Message-ID: <20250617164540.4fb196d4@collabora.com>
Date: Tue, 17 Jun 2025 16:45:40 +0200
From: Boris Brezillon <boris.brezillon@...labora.com>
To: Ashley Smith <ashley.smith@...labora.com>
Cc: Steven Price <steven.price@....com>, Liviu Dudau <liviu.dudau@....com>,
Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>, Maxime Ripard
<mripard@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>, David Airlie
<airlied@...il.com>, Simona Vetter <simona@...ll.ch>, kernel@...labora.com,
dri-devel@...ts.freedesktop.org (open list:ARM MALI PANTHOR DRM DRIVER),
linux-kernel@...r.kernel.org (open list)
Subject: Re: [PATCH v5 1/2] drm/panthor: Reset queue slots if termination
fails
On Tue, 3 Jun 2025 10:49:31 +0100
Ashley Smith <ashley.smith@...labora.com> wrote:
> This fixes a bug where if we timeout after a suspend and the termination
> fails, due to waiting on a fence that will never be signalled for
> example, we do not resume the group correctly. The fix forces a reset
> for groups that are not terminated correctly.
>
> Signed-off-by: Ashley Smith <ashley.smith@...labora.com>
> Fixes: de8548813824 ("drm/panthor: Add the scheduler logical block")
> ---
> drivers/gpu/drm/panthor/panthor_sched.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
> index 43ee57728de5..65d8ae3dcac1 100644
> --- a/drivers/gpu/drm/panthor/panthor_sched.c
> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> @@ -2727,8 +2727,17 @@ void panthor_sched_suspend(struct panthor_device *ptdev)
> * automatically terminate all active groups, so let's
> * force the state to halted here.
> */
> - if (csg_slot->group->state != PANTHOR_CS_GROUP_TERMINATED)
> + if (csg_slot->group->state != PANTHOR_CS_GROUP_TERMINATED) {
> csg_slot->group->state = PANTHOR_CS_GROUP_TERMINATED;
> +
> + /* Reset the queue slots manually if the termination
> + * request failed.
> + */
> + for (i = 0; i < group->queue_count; i++) {
group is used uninitialized here, which leads to a random (most likely
NULL) pointer deref. Either we go:

	for (i = 0; i < csg_slot->group->queue_count; i++) {

(with the same change for the group->queues[i] check below) and we move
the group variable declaration into the last for loop, so we're not
tempted to use it again in places where it's not initialized, or we use
the group variable consistently across this function by adding

	group = csg_slot->group;

assignments wherever csg_slot->group is currently used.

We might also want to consider splitting this huge function into
sub-functions, but probably not in a patch that's flagged for
backporting.
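To make the second option concrete, here is a minimal user-space sketch, not the actual driver code. The struct layouts, MAX_CSGS, MAX_QUEUES, force_terminate() and mock_cs_slot_reset() are all hypothetical stand-ins for the panthor types and cs_slot_reset_locked(); only the shape of the loop is meant to carry over: group is assigned from the slot before any use, and every later access goes through it.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the panthor types; only the fields used here. */
#define MAX_QUEUES 4
#define MAX_CSGS 2

enum group_state { GROUP_ACTIVE, GROUP_TERMINATED };

struct queue { int unused; };

struct group {
    enum group_state state;
    unsigned int queue_count;
    struct queue *queues[MAX_QUEUES];
};

struct csg_slot { struct group *group; };

/* Counts how many queue slots were reset, so the sketch can be checked. */
static unsigned int reset_count;

/* Stand-in for cs_slot_reset_locked(); it only records the call. */
static void mock_cs_slot_reset(unsigned int csg_id, unsigned int cs_id)
{
    (void)csg_id;
    (void)cs_id;
    reset_count++;
}

/*
 * Force every still-active group in slot_mask to the terminated state and
 * reset its queue slots. The group pointer is scoped to the loop body and
 * initialized from the slot before it is dereferenced.
 */
static void force_terminate(struct csg_slot *slots, unsigned int slot_mask)
{
    for (unsigned int csg_id = 0; csg_id < MAX_CSGS; csg_id++) {
        struct group *group;

        if (!(slot_mask & (1u << csg_id)))
            continue;

        group = slots[csg_id].group;
        if (!group || group->state == GROUP_TERMINATED)
            continue;

        group->state = GROUP_TERMINATED;

        /* Reset the queue slots manually, skipping unpopulated queues. */
        for (unsigned int i = 0; i < group->queue_count; i++) {
            if (group->queues[i])
                mock_cs_slot_reset(csg_id, i);
        }
    }
}
```

Declaring group inside the loop body means a stale or uninitialized value can't leak into other parts of the function, which is the same concern the first option addresses by moving the declaration.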
> + if (group->queues[i])
> + cs_slot_reset_locked(ptdev, csg_id, i);
> + }
> + }
> slot_mask &= ~BIT(csg_id);
> }
> }