linux-kernel - Re: [PATCH 3/5] drm/panfrost: Introduce JM context for manging job resources

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <duhc4uddygpsiccipe5ftv7mi4fz4khermst6jeqlkbigh6zmn@2btsx2yltsye>
Date: Mon, 8 Sep 2025 20:46:29 +0100
From: Adrián Larumbe <adrian.larumbe@...labora.com>
To: Daniel Stone <daniel@...ishbar.org>
Cc: linux-kernel@...r.kernel.org, dri-devel@...ts.freedesktop.org, 
	Boris Brezillon <boris.brezillon@...labora.com>, kernel@...labora.com, Rob Herring <robh@...nel.org>, 
	Steven Price <steven.price@....com>, Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>, 
	Maxime Ripard <mripard@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>, 
	David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>
Subject: Re: [PATCH 3/5] drm/panfrost: Introduce JM context for manging job
 resources

Hi Daniel,

On 30.08.2025 10:12, Daniel Stone wrote:
> Hi Adrian,
>
> On Thu, 28 Aug 2025 at 04:35, Adrián Larumbe
> <adrian.larumbe@...labora.com> wrote:
> > -void panfrost_job_close(struct panfrost_file_priv *panfrost_priv)
> > +int panfrost_jm_ctx_destroy(struct drm_file *file, u32 handle)
> >  {
> > -       struct panfrost_device *pfdev = panfrost_priv->pfdev;
> > -       int i;
> > +       struct panfrost_file_priv *priv = file->driver_priv;
> > +       struct panfrost_device *pfdev = priv->pfdev;
> > +       struct panfrost_jm_ctx *jm_ctx;
> >
> > -       for (i = 0; i < NUM_JOB_SLOTS; i++)
> > -               drm_sched_entity_destroy(&panfrost_priv->sched_entity[i]);
> > +       jm_ctx = xa_erase(&priv->jm_ctxs, handle);
> > +       if (!jm_ctx)
> > +               return -EINVAL;
> > +
> > +       for (u32 i = 0; i < ARRAY_SIZE(jm_ctx->slots); i++) {
> > +               if (jm_ctx->slots[i].enabled)
> > +                       drm_sched_entity_destroy(&jm_ctx->slots[i].sched_entity);
> > +       }
> >
> >         /* Kill in-flight jobs */
> >         spin_lock(&pfdev->js->job_lock);
> > -       for (i = 0; i < NUM_JOB_SLOTS; i++) {
> > -               struct drm_sched_entity *entity = &panfrost_priv->sched_entity[i];
> > -               int j;
> > +       for (u32 i = 0; i < ARRAY_SIZE(jm_ctx->slots); i++) {
> > +               struct drm_sched_entity *entity = &jm_ctx->slots[i].sched_entity;
> > +
> > +               if (!jm_ctx->slots[i].enabled)
> > +                       continue;
> >
> > -               for (j = ARRAY_SIZE(pfdev->jobs[0]) - 1; j >= 0; j--) {
> > +               for (int j = ARRAY_SIZE(pfdev->jobs[0]) - 1; j >= 0; j--) {
> >                         struct panfrost_job *job = pfdev->jobs[i][j];
> >                         u32 cmd;
> >
> > @@ -980,18 +1161,7 @@ void panfrost_job_close(struct panfrost_file_priv *panfrost_priv)
> >                 }
> >         }
> >         spin_unlock(&pfdev->js->job_lock);
> > -}
> > -
> > -int panfrost_job_is_idle(struct panfrost_device *pfdev)
> > -{
> > -       struct panfrost_job_slot *js = pfdev->js;
> > -       int i;
> > -
> > -       for (i = 0; i < NUM_JOB_SLOTS; i++) {
> > -               /* If there are any jobs in the HW queue, we're not idle */
> > -               if (atomic_read(&js->queue[i].sched.credit_count))
> > -                       return false;
> > -       }
> >
> > -       return true;
> > +       panfrost_jm_ctx_put(jm_ctx);
> > +       return 0;
> >  }
>
> It seems odd that both panfrost_jm_ctx_destroy() and
> panfrost_jm_ctx_release() share lifetime responsibilities. I'd expect
> calling panfrost_jm_ctx_destroy() to just release the xarray handle
> and drop the refcount.
>
> I can see why calling panfrost_jm_ctx_destroy() is the one to go try
> to cancel the jobs - because the jobs keep a refcount on the context,
> so we need to break that cycle somehow. But having both the
> handle-release and object-release function drop a ref on the sched
> entity seems odd?
>
> It doesn't help much that panfrost_job is used both for actual jobs
> (as the type) and the capability for a device to have multiple
> job-manager contexts (as a function prefix). Would be great to clean
> that up, so you don't have to think about whether e.g.
> panfrost_job_close() is actually operating on a panfrost_job, or
> operating on multiple panfrost_jm_ctx which operate on multiple
> panfrost_job.

I've submitted a v2 of this patch series. In the newer one, drm_sched_entity_destroy()
is only called in panfrost_jm_ctx_destroy(), whereas panfrost_jm_ctx_release() will just
free the memory associated with the context when the refcnt reaches 0.

As for the JS-and-panfrost_job function naming confusion, I'll deal with it in a
patch series I hope to submit before the end of this week.

> Cheers,
> Daniel

Adrian Larumbe