lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <20250625143722.55272-1-pierre-eric.pelloux-prayer@amd.com>
Date: Wed, 25 Jun 2025 16:37:21 +0200
From: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@....com>
To: Matthew Brost <matthew.brost@...el.com>, Danilo Krummrich
	<dakr@...nel.org>, Philipp Stanner <phasta@...nel.org>,
	Christian König <ckoenig.leichtzumerken@...il.com>,
	Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>, Maxime Ripard
	<mripard@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>, David Airlie
	<airlied@...il.com>, Simona Vetter <simona@...ll.ch>, Tvrtko Ursulin
	<tvrtko.ursulin@...lia.com>, Pierre-Eric Pelloux-Prayer
	<pierre-eric.pelloux-prayer@....com>
CC: <dri-devel@...ts.freedesktop.org>, <linux-kernel@...r.kernel.org>
Subject: [PATCH v1] drm/sched: Fix racy access to drm_sched_entity.dependency

The drm_sched_job_unschedulable trace point can access
entity->dependency after it was cleared by the callback
installed in drm_sched_entity_add_dependency_cb, causing:

BUG: kernel NULL pointer dereference, address: 0000000000000020
Workqueue: comp_1.1.0 drm_sched_run_job_work [gpu_sched]
RIP: 0010:trace_event_raw_event_drm_sched_job_unschedulable+0x70/0xd0 [gpu_sched]

To fix this we either need to take a reference of the fence before
setting up the callbacks, or move the trace_drm_sched_job_unschedulable
calls into drm_sched_entity_add_dependency_cb where they can be
done earlier. The former option is the more correct one because with
the latter we might emit the event and still be able to schedule the
job if the fence is signaled in-between. Despite that, I've
implemented the latter, since it's a bit simpler and the extra event
is not a deal breaker for tools anyway.

Fixes: 76d97c870f29 (drm/sched: Trace dependencies for GPU jobs)

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@....com>
---
 drivers/gpu/drm/scheduler/sched_entity.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index 5635b3a826d8..a23b204cac5c 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -401,7 +401,8 @@ EXPORT_SYMBOL(drm_sched_entity_set_priority);
  * Add a callback to the current dependency of the entity to wake up the
  * scheduler when the entity becomes available.
  */
-static bool drm_sched_entity_add_dependency_cb(struct drm_sched_entity *entity)
+static bool drm_sched_entity_add_dependency_cb(struct drm_sched_entity *entity,
+					       struct drm_sched_job *sched_job)
 {
 	struct drm_gpu_scheduler *sched = entity->rq->sched;
 	struct dma_fence *fence = entity->dependency;
@@ -429,6 +430,11 @@ static bool drm_sched_entity_add_dependency_cb(struct drm_sched_entity *entity)
 		fence = dma_fence_get(&s_fence->scheduled);
 		dma_fence_put(entity->dependency);
 		entity->dependency = fence;
+
+		if (trace_drm_sched_job_unschedulable_enabled() &&
+		    !dma_fence_is_signaled(fence))
+			trace_drm_sched_job_unschedulable(sched_job, fence);
+
 		if (!dma_fence_add_callback(fence, &entity->cb,
 					    drm_sched_entity_clear_dep))
 			return true;
@@ -438,6 +444,10 @@ static bool drm_sched_entity_add_dependency_cb(struct drm_sched_entity *entity)
 		return false;
 	}
 
+	if (trace_drm_sched_job_unschedulable_enabled() &&
+	    !dma_fence_is_signaled(entity->dependency))
+		trace_drm_sched_job_unschedulable(sched_job, entity->dependency);
+
 	if (!dma_fence_add_callback(entity->dependency, &entity->cb,
 				    drm_sched_entity_wakeup))
 		return true;
@@ -478,10 +488,8 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
 
 	while ((entity->dependency =
 			drm_sched_job_dependency(sched_job, entity))) {
-		if (drm_sched_entity_add_dependency_cb(entity)) {
-			trace_drm_sched_job_unschedulable(sched_job, entity->dependency);
+		if (drm_sched_entity_add_dependency_cb(entity, sched_job))
 			return NULL;
-		}
 	}
 
 	/* skip jobs from entity that marked guilty */
-- 
2.43.0


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ