Message-Id: <20221109190937.64155-2-janusz.krzysztofik@linux.intel.com>
Date: Wed, 9 Nov 2022 20:09:35 +0100
From: Janusz Krzysztofik <janusz.krzysztofik@...ux.intel.com>
To: Joonas Lahtinen <joonas.lahtinen@...ux.intel.com>
Cc: Jani Nikula <jani.nikula@...ux.intel.com>,
Rodrigo Vivi <rodrigo.vivi@...el.com>,
Tvrtko Ursulin <tvrtko.ursulin@...ux.intel.com>,
David Airlie <airlied@...il.com>,
Daniel Vetter <daniel@...ll.ch>,
Chris Wilson <chris.p.wilson@...el.com>,
intel-gfx@...ts.freedesktop.org, linux-kernel@...r.kernel.org
Subject: [PATCH 1/3] drm/i915: Fix timeout handling when retiring requests

I believe that intel_gt_retire_requests_timeout() should return either
-ETIME if all the time designated by the timeout argument has been consumed
while waiting for fences to be signaled, or the remaining time if there are
requests still not retired, or 0 otherwise.  In that last case, the
remaining time should be passed back via the remaining_timeout argument.

The remaining time is updated with the return value of each consecutive
call to dma_fence_wait_timeout().  If an error code is returned instead of
the remaining time, a few potentially unexpected side effects occur:
- we no longer wait for the last request fences of subsequent timelines to
  be signaled before we try to retire requests from those timelines --
  while expected in case of -ETIME, that's probably not intended in case
  of other errors that dma_fence_wait_timeout() can return,
- the error code (a negative value) is passed back as the remaining time
  and, if no more requests happen to be left pending despite the error, a
  user may pass that value forward as a remaining timeout -- that can
  potentially trigger a WARN or BUG,
- a potentially unexpected error code is returned to the user when a
  non-critical error that probably shouldn't stop the user from retrying
  occurs while active requests are still pending.
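
The second effect above can be illustrated with a small userspace model.
This is only a sketch: old_update_timeout() and timeout_is_valid() are
hypothetical helpers, not i915 code, and -EINTR merely stands in for
whatever non-timeout error dma_fence_wait_timeout() might return.  It
mirrors the unconditional assignment that the diff removes.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical model of the pre-patch update (not the i915 code itself):
 * whatever the wait returned simply replaces the remaining time.
 */
long old_update_timeout(long timeout, long time_left)
{
	/* Assumption: callers only wait while some time is still left. */
	assert(timeout > 0);

	/* The previous remaining time is lost, even on error. */
	return time_left;
}

/* Hypothetical check: a value is usable as a timeout only if non-negative. */
int timeout_is_valid(long timeout)
{
	return timeout >= 0;
}
```

With 100 time units remaining and a wait that fails with -EINTR,
old_update_timeout(100, -EINTR) yields -EINTR, which timeout_is_valid()
rejects.  That is the kind of value a caller could end up passing forward
as a remaining timeout.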

Moreover, should dma_fence_wait_timeout() ever return 0 (which should mean
timeout expiration) while we are processing requests, and there are still
pending requests when we are about to return, that 0 is returned to the
user as if all requests had been successfully retired.

Ignore error codes from dma_fence_wait_timeout() other than -ETIME and
don't overwrite the remaining time with those error codes.  Also, convert
a 0 returned by dma_fence_wait_timeout() to -ETIME.

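
The resulting update rule can be modelled in userspace as follows.  This
is a minimal sketch, not the driver code: update_timeout() is a
hypothetical helper that wraps the logic the diff adds inline, time_left
stands for the value returned by dma_fence_wait_timeout(), and the
fallback definition of ETIME is an assumption for platforms whose
<errno.h> lacks it.

```c
#include <assert.h>
#include <errno.h>

#ifndef ETIME
#define ETIME 62	/* Linux value; fallback only, an assumption */
#endif

/*
 * Hypothetical model of the post-patch update (not the i915 code itself).
 * timeout:   remaining time before the wait
 * time_left: result of the wait
 * Returns the new remaining time, or -ETIME once the time has run out.
 */
long update_timeout(long timeout, long time_left)
{
	/* Assumption: callers only wait while some time is still left. */
	assert(timeout > 0);

	/* 0 or -ETIME: timeout expired */
	if (time_left == -ETIME || time_left == 0)
		return -ETIME;

	/* positive: that much time is still left */
	if (time_left > 0)
		return time_left;

	/* other errors: ignore, assume no time consumed */
	return timeout;
}
```

For example, update_timeout(100, -EINTR) keeps the remaining time at 100,
while both update_timeout(100, 0) and update_timeout(100, -ETIME) give
-ETIME.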
Fixes: f33a8a51602c ("drm/i915: Merge wait_for_timelines with retire_request")
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@...ux.intel.com>
Cc: stable@...r.kernel.org # v5.5+
---
drivers/gpu/drm/i915/gt/intel_gt_requests.c | 17 ++++++++++++++---
1 file changed, 14 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
index edb881d756309..6c3b8ac3055c3 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
@@ -156,11 +156,22 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout,
 
 			fence = i915_active_fence_get(&tl->last_request);
 			if (fence) {
+				signed long time_left;
+
 				mutex_unlock(&tl->mutex);
 
-				timeout = dma_fence_wait_timeout(fence,
-								 true,
-								 timeout);
+				time_left = dma_fence_wait_timeout(fence,
+								   true,
+								   timeout);
+				/*
+				 * 0 or -ETIME: timeout expired
+				 * other errors: ignore, assume no time consumed
+				 */
+				if (time_left == -ETIME || time_left == 0)
+					timeout = -ETIME;
+				else if (time_left > 0)
+					timeout = time_left;
+
 				dma_fence_put(fence);
 
 				/* Retirement is best effort */
--
2.25.1