[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87f93819aed4784b3e280af38bd7b447821873f5.camel@mailbox.org>
Date: Tue, 14 Oct 2025 13:56:30 +0200
From: Philipp Stanner <phasta@...lbox.org>
To: Tvrtko Ursulin <tursulin@...ulin.net>, Philipp Stanner
<phasta@...nel.org>, Matthew Brost <matthew.brost@...el.com>, Danilo
Krummrich <dakr@...nel.org>, Christian König
<ckoenig.leichtzumerken@...il.com>
Cc: dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] drm/sched: Mandate usage of drm_sched_job_cleanup()
On Mon, 2025-10-13 at 13:20 +0100, Tvrtko Ursulin wrote:
>
> On 26/09/2025 13:36, Philipp Stanner wrote:
> > drm_sched_job_cleanup()'s documentation so far uses relatively soft
> > language, only "recommending" usage of the function. To avoid memory
> > leaks and, potentiall, other bugs, however, the function has to be used.
> >
> > Demand usage of the function explicitly.
> >
> > Signed-off-by: Philipp Stanner <phasta@...nel.org>
> > ---
> > drivers/gpu/drm/scheduler/sched_main.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > index 46119aacb809..0a9df9e61963 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -1030,7 +1030,7 @@ EXPORT_SYMBOL(drm_sched_job_has_dependency);
> > *
> > * Cleans up the resources allocated with drm_sched_job_init().
> > *
> > - * Drivers should call this from their error unwind code if @job is aborted
> > + * Drivers need to call this from their error unwind code if @job is aborted
>
> Should is indeed wrong. I think it could be better to go with "shall" or
> "must" though. Since those two are more standard language for this.
"need to" is standard UK(?) English for "must to" I think.
But I'm fine with "must"
>
> > * before drm_sched_job_arm() is called.
> > *
> > * drm_sched_job_arm() is a point of no return since it initializes the fences
> > @@ -1038,7 +1038,7 @@ EXPORT_SYMBOL(drm_sched_job_has_dependency);
> > * submit it with drm_sched_entity_push_job() and cannot simply abort it by
> > * calling drm_sched_job_cleanup().
> > *
> > - * This function should be called in the &drm_sched_backend_ops.free_job callback.
> > + * This function must be called in the &drm_sched_backend_ops.free_job callback.
> > */
> > void drm_sched_job_cleanup(struct drm_sched_job *job)
> > {
>
> Hm, having read the thread so far the situation seems not so easy to
> untangle.
>
> On one hand I see Matt's point that free_job callback is a bit
> misleadingly named and that there isn't anything really requiring
> drm_sched_job_cleanup() to be called exactly from there. Free_job
> semantics seem to be "job is done, *scheduler* is done with it".
>
> Drm_sched_job_cleanup() already says it has to be called after the point
> of no return (arm) so it could be left at drivers discretion (de facto
> state today) to choose how and when to do it.
>
> What I did not get is what is actually the perceived problem with
> letting this stay? (Ie. "should" indicating it is recommended, but not a
> must/shall.)
I think the most correct formulation with our current mess would be
"cleanup() must be called earliest in free_job(), or later (if there is
a very good reason for that)."
Question would be how to document that.
I still have that big life time docu patch pending. Let me dig into how
I could combine that..
P.
>
> Regards,
>
> Tvrtko
>
Powered by blists - more mailing lists