[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <dfda6c38-fcd2-4d94-b9cd-c51972f40106@ursulin.net>
Date: Tue, 14 Oct 2025 13:05:44 +0100
From: Tvrtko Ursulin <tursulin@...ulin.net>
To: phasta@...nel.org, Matthew Brost <matthew.brost@...el.com>,
Danilo Krummrich <dakr@...nel.org>,
Christian König <ckoenig.leichtzumerken@...il.com>
Cc: dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] drm/sched: Mandate usage of drm_sched_job_cleanup()
On 14/10/2025 12:56, Philipp Stanner wrote:
> On Mon, 2025-10-13 at 13:20 +0100, Tvrtko Ursulin wrote:
>>
>> On 26/09/2025 13:36, Philipp Stanner wrote:
>>> drm_sched_job_cleanup()'s documentation so far uses relatively soft
>>> language, only "recommending" usage of the function. To avoid memory
>>> leaks and, potentiall, other bugs, however, the function has to be used.
>>>
>>> Demand usage of the function explicitly.
>>>
>>> Signed-off-by: Philipp Stanner <phasta@...nel.org>
>>> ---
>>> drivers/gpu/drm/scheduler/sched_main.c | 4 ++--
>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>> index 46119aacb809..0a9df9e61963 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -1030,7 +1030,7 @@ EXPORT_SYMBOL(drm_sched_job_has_dependency);
>>> *
>>> * Cleans up the resources allocated with drm_sched_job_init().
>>> *
>>> - * Drivers should call this from their error unwind code if @job is aborted
>>> + * Drivers need to call this from their error unwind code if @job is aborted
>>
>> Should is indeed wrong. I think it could be better to go with "shall" or
>> "must" though. Since those two are more standard language for this.
>
> "need to" is standard UK(?) English for "must to" I think.
>
> But I'm fine with "must"
Yeah will is standard English and IMO understandable. I was just
thinking whether it would best to stick to the established "requirements
language". For example:
https://www.nasa.gov/reference/appendix-c-how-to-write-a-good-requirement/
https://argondigital.com/blog/product-management/using-the-correct-terms-shall-will-should/
https://reqi.io/articles/shall-vs-should-vs-will-vs-must-whats-the-difference
But I am okay with need, your call.
>>> * before drm_sched_job_arm() is called.
>>> *
>>> * drm_sched_job_arm() is a point of no return since it initializes the fences
>>> @@ -1038,7 +1038,7 @@ EXPORT_SYMBOL(drm_sched_job_has_dependency);
>>> * submit it with drm_sched_entity_push_job() and cannot simply abort it by
>>> * calling drm_sched_job_cleanup().
>>> *
>>> - * This function should be called in the &drm_sched_backend_ops.free_job callback.
>>> + * This function must be called in the &drm_sched_backend_ops.free_job callback.
>>> */
>>> void drm_sched_job_cleanup(struct drm_sched_job *job)
>>> {
>>
>> Hm, having read the thread so far the situation seems not so easy to
>> untangle.
>>
>> On one hand I see Matt's point that free_job callback is a bit
>> misleadingly named and that there isn't anything really requiring
>> drm_sched_job_cleanup() to be called exactly from there. Free_job
>> semantics seem to be "job is done, *scheduler* is done with it".
>>
>> Drm_sched_job_cleanup() already says it has to be called after the point
>> of no return (arm) so it could be left at drivers discretion (de facto
>> state today) to choose how and when to do it.
>>
>> What I did not get is what is actually the perceived problem with
>> letting this stay? (Ie. "should" indicating it is recommended, but not a
>> must/shall.)
>
> I think the most correct formulation with our current mess would be
> "cleanup() must be called earliest in free_job(), or later (if there is
> a very good reason for that)."
>
> Question would be how to document that.
>
> I still have that big life time docu patch pending. Let me dig into how
> I could combine that..
Ok.
Regards,
Tvrtko
Powered by blists - more mailing lists