Message-ID: <aP+lIKQ6FHLltgcl@lstrano-desk.jf.intel.com>
Date: Mon, 27 Oct 2025 10:00:16 -0700
From: Matthew Brost <matthew.brost@...el.com>
To: Philipp Stanner <pstanner@...hat.com>
CC: <intel-xe@...ts.freedesktop.org>, <dri-devel@...ts.freedesktop.org>,
<linux-kernel@...r.kernel.org>, <jiangshanlai@...il.com>, <tj@...nel.org>,
<simona.vetter@...ll.ch>, <christian.koenig@....com>, <dakr@...nel.org>
Subject: Re: [RFC PATCH 2/3] drm/sched: Taint workqueues with reclaim
On Mon, Oct 27, 2025 at 12:03:33PM +0100, Philipp Stanner wrote:
> On Tue, 2025-10-21 at 14:39 -0700, Matthew Brost wrote:
> > Multiple drivers seemingly do not understand the role of DMA fences in
> > the reclaim path. As a result,
> >
>
> result of what? The "role of DMA fences"?
>
> > DRM scheduler workqueues, which are part
> > of the fence signaling path, must not allocate memory.
> >
>
> Should be phrased differently. The actual rule here is "The GPU
> scheduler's workqueues can be used for memory reclaim. Because of that,
> work items on these queues must not allocate memory."
>
Sure, will reword.
> --
>
> In general, I often read in commits or discussions about this or that
> "rule", especially "DMA fence rules", but they're often not detailed
> very much.
>
Yes, I kinda assume the audience reviewing any dma-buf or drm-sched
really understands the "DMA fence rules", compared to driver devs, who
often do not really get this concept. Tainting the workqueues here will
help driver devs avoid mistakes and hopefully, along the way, get them to
the point where they understand the "DMA fence rules" - it took me a few
years to really get these rules.
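
To make the rule concrete, here is a toy userspace model in C of the kind
of mistake the annotation is meant to catch. It is only a sketch and not
kernel code: model_alloc(), run_tainted_work(), in_reclaim_path and the two
example work functions are hypothetical names invented for illustration,
and the real check would be done by lockdep rather than by a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy userspace model, NOT kernel code. All names are hypothetical. */

static bool in_reclaim_path;	/* true while a "tainted" work item runs */
static int violations;		/* allocations attempted under reclaim */

/* Stand-in for a blocking allocation (think GFP_KERNEL). */
static void *model_alloc(size_t size)
{
	if (in_reclaim_path)
		violations++;	/* this is where lockdep would complain */
	return malloc(size);
}

typedef void (*work_fn)(void);

/*
 * Run a work item as if it sat on a reclaim-tainted workqueue and
 * return how many forbidden allocations it attempted.
 */
static int run_tainted_work(work_fn fn)
{
	int before = violations;

	assert(fn != NULL);
	in_reclaim_path = true;
	fn();
	in_reclaim_path = false;
	return violations - before;
}

/* Obeys the rule: touches only state that was set up beforehand. */
static void good_work(void)
{
}

/* Breaks the rule: allocates while running in the reclaim path. */
static void bad_work(void)
{
	free(model_alloc(64));
}
```

In this model run_tainted_work(good_work) reports no violations, while
run_tainted_work(bad_work) reports one. That second case is roughly the
driver-side bug a work item on submit_wq or timeout_wq should not contain.
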
Matt
>
> P.
>
> > This patch
> > teaches lockdep to recognize these rules in order to catch driver-side
> > bugs.
> >
> > Cc: Christian König <christian.koenig@....com>
> > Cc: Danilo Krummrich <dakr@...nel.org>
> > Cc: Matthew Brost <matthew.brost@...el.com>
> > Cc: Philipp Stanner <phasta@...nel.org>
> > Cc: dri-devel@...ts.freedesktop.org
> > Signed-off-by: Matthew Brost <matthew.brost@...el.com>
> > ---
> > drivers/gpu/drm/scheduler/sched_main.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > index c39f0245e3a9..676484dd3ea3 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -1368,6 +1368,9 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, const struct drm_sched_init_
> >  	atomic64_set(&sched->job_id_count, 0);
> >  	sched->pause_submit = false;
> >
> > +	taint_reclaim_workqueue(sched->submit_wq, GFP_KERNEL);
> > +	taint_reclaim_workqueue(sched->timeout_wq, GFP_KERNEL);
> > +
> >  	sched->ready = true;
> >  	return 0;
> >  Out_unroll:
>