[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20220118024007.1950576-88-sashal@kernel.org>
Date: Mon, 17 Jan 2022 21:39:39 -0500
From: Sasha Levin <sashal@...nel.org>
To: linux-kernel@...r.kernel.org, stable@...r.kernel.org
Cc: Lucas Stach <l.stach@...gutronix.de>,
Joerg Albert <joerg.albert@....de>,
Christian Gmeiner <christian.gmeiner@...il.com>,
Sasha Levin <sashal@...nel.org>, airlied@...ux.ie,
daniel@...ll.ch, etnaviv@...ts.freedesktop.org,
dri-devel@...ts.freedesktop.org
Subject: [PATCH AUTOSEL 5.10 088/116] drm/etnaviv: consider completed fence seqno in hang check
From: Lucas Stach <l.stach@...gutronix.de>
[ Upstream commit cdd156955f946beaa5f3a00d8ccf90e5a197becc ]
Some GPU heavy test programs manage to trigger the hangcheck quite often.
If there are no other GPU users in the system and the test program
exhibits a very regular structure in the commandstreams that are being
submitted, we can end up with two distinct submits managing to trigger
the hangcheck with the FE in a very similar address range. This leads
the hangcheck to believe that the GPU is stuck, while in reality the GPU
is already busy working on a different job. To avoid those spurious
GPU resets, also remember and consider the last completed fence seqno
in the hang check.
Reported-by: Joerg Albert <joerg.albert@....de>
Signed-off-by: Lucas Stach <l.stach@...gutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@...il.com>
Signed-off-by: Sasha Levin <sashal@...nel.org>
---
drivers/gpu/drm/etnaviv/etnaviv_gpu.h | 1 +
drivers/gpu/drm/etnaviv/etnaviv_sched.c | 4 +++-
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
index 1c75c8ed5bcea..85eddd492774d 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
@@ -130,6 +130,7 @@ struct etnaviv_gpu {
/* hang detection */
u32 hangcheck_dma_addr;
+ u32 hangcheck_fence;
void __iomem *mmio;
int irq;
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index cd46c882269cc..026b6c0731198 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -106,8 +106,10 @@ static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job)
*/
dma_addr = gpu_read(gpu, VIVS_FE_DMA_ADDRESS);
change = dma_addr - gpu->hangcheck_dma_addr;
- if (change < 0 || change > 16) {
+ if (gpu->completed_fence != gpu->hangcheck_fence ||
+ change < 0 || change > 16) {
gpu->hangcheck_dma_addr = dma_addr;
+ gpu->hangcheck_fence = gpu->completed_fence;
goto out_no_timeout;
}
--
2.34.1
Powered by blists - more mailing lists