Message-ID: <20251028134602.94125-4-phasta@kernel.org>
Date: Tue, 28 Oct 2025 14:46:02 +0100
From: Philipp Stanner <phasta@...nel.org>
To: Matthew Brost <matthew.brost@...el.com>,
	Danilo Krummrich <dakr@...nel.org>,
	Philipp Stanner <phasta@...nel.org>,
	Christian König <ckoenig.leichtzumerken@...il.com>,
	Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
	Maxime Ripard <mripard@...nel.org>,
	Thomas Zimmermann <tzimmermann@...e.de>,
	David Airlie <airlied@...il.com>,
	Simona Vetter <simona@...ll.ch>,
	tursulin@...ulin.net
Cc: dri-devel@...ts.freedesktop.org,
	linux-kernel@...r.kernel.org
Subject: [PATCH 2/2] drm/sched: Add FIXME detailing potential hang

If a job from a ready entity needs more credits than are currently
available, drm_sched_run_job_work() (a work item) simply returns and
doesn't reschedule itself. The scheduler is only woken up again when the
next job gets pushed with drm_sched_entity_push_job().

If someone submits a job that needs too many credits and doesn't submit
more jobs afterwards, the scheduler never pulls the too-expensive job
and effectively hangs forever.

Document this problem as a FIXME.

Signed-off-by: Philipp Stanner <phasta@...nel.org>
---
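For illustration only, the sequence described in the commit message can be
reproduced in a small user-space model. This is a sketch under stated
assumptions, not scheduler code: struct model_sched and the model_*()
helpers are hypothetical names that exist nowhere in the kernel, and the
model knows only the single wake-up source the commit message names (a job
push).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel API. */
struct model_sched {
	unsigned int available_credits;   /* credits currently free */
	unsigned int pending_job_credits; /* cost of the queued job */
	bool has_pending_job;             /* is a job waiting to be pulled? */
	bool work_queued;                 /* is the run-job work item pending? */
	unsigned int jobs_run;            /* jobs pulled so far */
};

/* Models a job push: queue the job and wake the work item. */
static void model_push_job(struct model_sched *s, unsigned int credits)
{
	s->pending_job_credits = credits;
	s->has_pending_job = true;
	s->work_queued = true;
}

/* Models the work item: pull the job if it fits, otherwise just return. */
static void model_run_job_work(struct model_sched *s)
{
	s->work_queued = false;

	if (!s->has_pending_job)
		return;

	/* Not enough credits: return without re-queueing the work item. */
	if (s->pending_job_credits > s->available_credits)
		return;

	assert(s->available_credits >= s->pending_job_credits);
	s->available_credits -= s->pending_job_credits;
	s->has_pending_job = false;
	s->jobs_run++;
}

/* Run the work item for as long as something has queued it. */
static void model_drain(struct model_sched *s)
{
	while (s->work_queued)
		model_run_job_work(s);
}
```

In this model, pushing a job that costs 8 credits while only 4 are
available and then calling model_drain() leaves the job pending with the
work item no longer queued. Nothing short of another model_push_job()
makes it run again.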
 drivers/gpu/drm/scheduler/sched_main.c | 10 ++++++++++
 1 file changed, 10 insertions(+)
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 492e8af639db..eaf8d17b2a66 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -1237,6 +1237,16 @@ static void drm_sched_run_job_work(struct work_struct *w)
 
 	/* Find entity with a ready job */
 	entity = drm_sched_select_entity(sched);
+	/*
+	 * FIXME:
+	 * The entity can be NULL when the scheduler currently has no capacity
+	 * (credits) for more jobs. If that happens, the work item terminates
+	 * itself here, without rescheduling itself.
+	 *
+	 * It only gets started again in drm_sched_entity_push_job(). IOW, the
+	 * scheduler might hang forever if a job that needs too many credits
+	 * gets submitted to an entity and no further jobs are pushed after it.
+	 */
 	if (!entity) {
 		/*
 		 * Either no more work to do, or the next ready job needs more
-- 
2.49.0