[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <fae980fa-e173-4921-90e2-6a4f6b8833a8@igalia.com>
Date: Tue, 3 Jun 2025 13:27:26 +0100
From: Tvrtko Ursulin <tvrtko.ursulin@...lia.com>
To: Philipp Stanner <phasta@...nel.org>, Lyude Paul <lyude@...hat.com>,
Danilo Krummrich <dakr@...nel.org>, David Airlie <airlied@...il.com>,
Simona Vetter <simona@...ll.ch>, Matthew Brost <matthew.brost@...el.com>,
Christian König <ckoenig.leichtzumerken@...il.com>,
Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
Maxime Ripard <mripard@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>,
Sumit Semwal <sumit.semwal@...aro.org>,
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@....com>
Cc: dri-devel@...ts.freedesktop.org, nouveau@...ts.freedesktop.org,
linux-kernel@...r.kernel.org, linux-media@...r.kernel.org
Subject: Re: [RFC PATCH 0/6] drm/sched: Avoid memory leaks by canceling
job-by-job
On 03/06/2025 10:31, Philipp Stanner wrote:
> An alternative version to [1], based on Tvrtko's suggestion from [2].
>
> I tested this for Nouveau. Works.
>
> I'm having, however, bigger problems properly porting the unit tests and
> have seen various explosions. In the process I noticed that some things
> in the unit tests aren't right and a bit of a larger rework will be
> necessary (for example, the timedout job callback must signal the
> timedout fence, remove it from the list and so on).
General approach I follow when implementing any mock component is to
implement only as much is needed for a test to pass. Only add more and
rework when a test/functionality is added which requires it.
Specifically for timedout callback signaling I see that I had exactly
that added in the patch you linked as [2].
> Anyways. Please comment on the general idea.
I am obviously okay with it. :) Especially now that you verified it
works well for nouveau.
What I am not that ecstatic about is only getting the Suggested-by
credit in 1/6. Given it is basically my patch with some cosmetic changes
like the kernel doc and the cancel loop extracted to a helper.
> @Tvrtko: As briefly brainstormed about on IRC, if you'd be willing to
> take care of the unit tests patch, I could remove that one (and,
> maaaaybe, the warning print patch) from the series and we could merge
> this RFC's successor version %N once it's ready. What do you think?
Okay in principle but the first thing I would suggest you could try is
to take my unit tests adaptations from [2] verbatim. Benefit of keeping
everything in one series is more confidence we are merging a solid
thing. But I can take it on myself as a follow up too if you want.
Regards,
Tvrtko
>
> P.
>
> [1] https://lore.kernel.org/dri-devel/20250522082742.148191-2-phasta@kernel.org/
> [2] https://lore.kernel.org/dri-devel/20250418113211.69956-1-tvrtko.ursulin@igalia.com/
>
> Philipp Stanner (6):
> drm/sched: Avoid memory leaks with cancel_job() callback
> drm/sched/tests: Implement cancel_job()
> drm/sched: Warn if pending list is not empty
> drm/nouveau: Make fence container helper usable driver-wide
> drm/nouveau: Add new callback for scheduler teardown
> drm/nouveau: Remove waitque for sched teardown
>
> drivers/gpu/drm/nouveau/nouveau_fence.c | 35 +++++----
> drivers/gpu/drm/nouveau/nouveau_fence.h | 7 ++
> drivers/gpu/drm/nouveau/nouveau_sched.c | 35 +++++----
> drivers/gpu/drm/nouveau/nouveau_sched.h | 9 +--
> drivers/gpu/drm/nouveau/nouveau_uvmm.c | 8 +--
> drivers/gpu/drm/scheduler/sched_main.c | 37 ++++++----
> .../gpu/drm/scheduler/tests/mock_scheduler.c | 71 +++++++------------
> drivers/gpu/drm/scheduler/tests/sched_tests.h | 4 +-
> include/drm/gpu_scheduler.h | 9 +++
> 9 files changed, 115 insertions(+), 100 deletions(-)
>
Powered by blists - more mailing lists