Message-ID: <1659174051-27816-1-git-send-email-quic_akhilpo@quicinc.com>
Date: Sat, 30 Jul 2022 15:10:43 +0530
From: Akhil P Oommen <quic_akhilpo@...cinc.com>
To: freedreno <freedreno@...ts.freedesktop.org>,
<dri-devel@...ts.freedesktop.org>, <linux-arm-msm@...r.kernel.org>,
Rob Clark <robdclark@...il.com>,
Bjorn Andersson <bjorn.andersson@...aro.org>
CC: Jordan Crouse <jordan@...micpenguin.net>,
Jonathan Marek <jonathan@...ek.ca>,
Douglas Anderson <dianders@...omium.org>,
"Matthias Kaehlcke" <mka@...omium.org>,
Akhil P Oommen <quic_akhilpo@...cinc.com>,
Abhinav Kumar <quic_abhinavk@...cinc.com>,
AngeloGioacchino Del Regno
<angelogioacchino.delregno@...labora.com>,
Chia-I Wu <olvaffe@...il.com>,
Dan Carpenter <dan.carpenter@...cle.com>,
Daniel Vetter <daniel@...ll.ch>,
David Airlie <airlied@...ux.ie>,
Dmitry Baryshkov <dmitry.baryshkov@...aro.org>,
Nathan Chancellor <nathan@...nel.org>,
"Philipp Zabel" <p.zabel@...gutronix.de>,
Sean Paul <sean@...rly.run>,
Stephen Boyd <swboyd@...omium.org>,
Vladimir Lypak <vladimir.lypak@...il.com>,
Wang Qing <wangqing@...o.com>, <linux-kernel@...r.kernel.org>
Subject: [PATCH v3 0/8] Improve GPU Recovery
Recently, I debugged a few device crashes which occurred during recovery
after a hangcheck timeout. It looks like there are a few things we can
do to improve our chances of a successful GPU recovery.
The first is to ensure that the CX GDSC collapses, which clears the
internal state in the GPU's CX domain. The first 5 patches handle this.
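As a rough illustration of what "ensuring" a collapse can look like, here
is a minimal sketch of a bounded poll: keep checking whether the domain is
still on, and give up after a fixed number of attempts. Everything in it
(wait_for_collapse, domain_is_on_fn, fake_domain) is a hypothetical
userspace stand-in, not the interface these patches use.

```c
#include <stdbool.h>

/* Hypothetical callback: returns true while the power domain is still on. */
typedef bool (*domain_is_on_fn)(void *ctx);

/*
 * Poll until the domain reports "off" or max_polls reads have been made.
 * Returns 0 once the domain has collapsed, -1 if it is still on.
 */
static int wait_for_collapse(domain_is_on_fn is_on, void *ctx, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if (!is_on(ctx))
			return 0;
		/* A real driver would sleep or delay between reads here. */
	}
	return -1;
}

/* Fake domain for demonstration: reports "on" for a fixed number of reads. */
struct fake_domain {
	int reads_until_off;
};

static bool fake_is_on(void *ctx)
{
	struct fake_domain *d = ctx;

	if (d->reads_until_off > 0) {
		d->reads_until_off--;
		return true;
	}
	return false;
}
```

With a fake_domain that stays on for 3 reads, wait_for_collapse() with a
budget of 10 polls returns 0; with a domain that stays on for 20 reads and
a budget of 5, it returns -1, which is the case a recovery path would have
to treat as a failed collapse.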
The rest of the patches ensure that a few internal blocks like CP, GMU
and GBIF are halted properly before proceeding to a snapshot followed by
recovery. They also handle the 'prepare slumber' HFI failure correctly.
These are A6xx-specific improvements.
This series is rebased on top of [1], which is based on Linus's master
branch.
[1] https://patchwork.freedesktop.org/series/106860/
Changes in v3:
- Use reset interface from gpucc driver to poll for cx gdsc collapse
https://patchwork.freedesktop.org/series/106860/
- Use single pm refcount for all active submits
Changes in v2:
- Rebased on msm-next tip
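The "single pm refcount for all active submits" item in the v3 changelog
can be pictured with a small userspace model: the first in-flight submit
takes one runtime PM reference and the last retiring submit drops it,
instead of every submit holding its own. The names below (gpu_model,
model_submit, model_retire, model_rpm_get, model_rpm_put) are hypothetical
and only model the counting; they are not the driver's code.

```c
#include <assert.h>

/* Hypothetical userspace model of the counting, not the msm driver. */
struct gpu_model {
	int active_submits;	/* submits currently in flight */
	int rpm_refs;		/* stands in for the runtime PM usage count */
};

/* Stand-ins for taking and dropping a runtime PM reference. */
static void model_rpm_get(struct gpu_model *gpu)
{
	gpu->rpm_refs++;
}

static void model_rpm_put(struct gpu_model *gpu)
{
	assert(gpu->rpm_refs > 0);
	gpu->rpm_refs--;
}

/* The first active submit takes the single reference. */
static void model_submit(struct gpu_model *gpu)
{
	if (gpu->active_submits++ == 0)
		model_rpm_get(gpu);
}

/* The last submit to retire drops the single reference. */
static void model_retire(struct gpu_model *gpu)
{
	assert(gpu->active_submits > 0);
	if (--gpu->active_submits == 0)
		model_rpm_put(gpu);
}
```

In this model, two calls to model_submit() leave rpm_refs at 1, and it
returns to 0 only after both submits have been retired. A real
implementation would also need locking around the counter, which this
sketch leaves out.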
Akhil P Oommen (8):
drm/msm: Remove unnecessary pm_runtime_get/put
drm/msm: Take single rpm refcount on behalf of all submits
drm/msm: Correct pm_runtime votes in recover worker
drm/msm: Fix cx collapse issue during recovery
drm/msm/a6xx: Ensure CX collapse during gpu recovery
drm/msm/adreno: Remove a WARN() during runtime_suspend
drm/msm/a6xx: Improve gpu recovery sequence
drm/msm/a6xx: Handle GMU prepare-slumber hfi failure
drivers/gpu/drm/msm/adreno/a6xx.xml.h | 4 ++
drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 83 +++++++++++++++++++-----------
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 35 +++++++++++--
drivers/gpu/drm/msm/adreno/adreno_device.c | 7 ---
drivers/gpu/drm/msm/msm_gpu.c | 21 +++++---
drivers/gpu/drm/msm/msm_gpu.h | 4 ++
drivers/gpu/drm/msm/msm_ringbuffer.c | 4 --
7 files changed, 106 insertions(+), 52 deletions(-)
--
2.7.4