linux-kernel - [PATCH 1/2] drm/sched: Fix drm_sched_entity

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-ID: <20241119134122.21950-2-pstanner@redhat.com>
Date: Tue, 19 Nov 2024 14:41:22 +0100
From: Philipp Stanner <pstanner@...hat.com>
To: Luben Tuikov <ltuikov89@...il.com>,
	Matthew Brost <matthew.brost@...el.com>,
	Danilo Krummrich <dakr@...nel.org>,
	Philipp Stanner <pstanner@...hat.com>,
	Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
	Maxime Ripard <mripard@...nel.org>,
	Thomas Zimmermann <tzimmermann@...e.de>,
	David Airlie <airlied@...il.com>,
	Simona Vetter <simona@...ll.ch>
Cc: dri-devel@...ts.freedesktop.org,
	linux-kernel@...r.kernel.org,
	Christian König <christian.koenig@....com>
Subject: [PATCH 1/2] drm/sched: Fix drm_sched_entity_flush() return val

The documentation of drm_sched_entity_flush() states that the function
shall - always - return the remaining timeout time in jiffies.

However, that is not what the function actually does; in one of its if
branches it simply returns the unchanged timeout value.

Furthermore, the used function wait_event_timeout() doesn't always
return the remaining timeout time.

Adjust the function so that it actually does what the documentation
states it shall do.

Cc: Christian König <christian.koenig@....com>
Signed-off-by: Philipp Stanner <pstanner@...hat.com>
---
@AMD:
You guys are the only ones who use the function's return code, so I
leave it completely up to you to decide what behavior you want.

But we should at least do something, because right now function
documentation and behavior do not match.

P.
---
 drivers/gpu/drm/scheduler/sched_entity.c | 23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index a75eede8bf8d..16b172aee453 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -278,7 +278,7 @@ static void drm_sched_entity_kill(struct drm_sched_entity *entity)
  * waiting, removes the entity from the runqueue and returns an error when the
  * process was killed.
  *
- * Returns the remaining time in jiffies left from the input timeout
+ * Returns: 0 if the timeout ellapsed, the remaining time otherwise.
  */
 long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout)
 {
@@ -294,15 +294,24 @@ long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout)
 	 * The client will not queue more IBs during this fini, consume existing
 	 * queued IBs or discard them on SIGKILL
 	 */
-	if (current->flags & PF_EXITING) {
-		if (timeout)
-			ret = wait_event_timeout(
-					sched->job_scheduled,
-					drm_sched_entity_is_idle(entity),
-					timeout);
+	if (timeout != 0 && (current->flags & PF_EXITING)) {
+		ret = wait_event_timeout(sched->job_scheduled,
+				drm_sched_entity_is_idle(entity),
+				timeout);
+		/*
+		 * wait_event_timeout() returns 1 if it timed out but the
+		 * condition became true on timeout. We only care about whether
+		 * it timed out or not.
+		 */
+		if (ret == 1)
+			ret = 0;
 	} else {
 		wait_event_killable(sched->job_scheduled,
 				    drm_sched_entity_is_idle(entity));
+
+		ret -= (long)get_jiffies_64();
+		if (ret < 0)
+			ret = 0;
 	}
 
 	/* For killed process disable any more IBs enqueue right now */
-- 
2.47.0