Message-ID: <1657346375-1461-1-git-send-email-quic_akhilpo@quicinc.com>
Date:   Sat, 9 Jul 2022 11:29:28 +0530
From:   Akhil P Oommen <quic_akhilpo@...cinc.com>
To:     freedreno <freedreno@...ts.freedesktop.org>,
        <dri-devel@...ts.freedesktop.org>, <linux-arm-msm@...r.kernel.org>,
        Rob Clark <robdclark@...il.com>,
        Bjorn Andersson <bjorn.andersson@...aro.org>
CC:     Jonathan Marek <jonathan@...ek.ca>,
        Jordan Crouse <jordan@...micpenguin.net>,
        Matthias Kaehlcke <mka@...omium.org>,
        "Douglas Anderson" <dianders@...omium.org>,
        Akhil P Oommen <quic_akhilpo@...cinc.com>,
        Abhinav Kumar <quic_abhinavk@...cinc.com>,
        Andy Gross <agross@...nel.org>, Chia-I Wu <olvaffe@...il.com>,
        Christian König <christian.koenig@....com>,
        Daniel Vetter <daniel@...ll.ch>,
        David Airlie <airlied@...ux.ie>,
        Dmitry Baryshkov <dmitry.baryshkov@...aro.org>,
        "Geert Uytterhoeven" <geert@...ux-m68k.org>,
        Konrad Dybcio <konrad.dybcio@...ainline.org>,
        Krzysztof Kozlowski <krzysztof.kozlowski+dt@...aro.org>,
        Rob Herring <robh+dt@...nel.org>,
        "Sean Paul" <sean@...rly.run>, Stephen Boyd <swboyd@...omium.org>,
        Wang Qing <wangqing@...o.com>, <devicetree@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>
Subject: [PATCH v2 0/7] Improve GPU Recovery


Recently, I debugged a few device crashes which occurred during recovery
after a hangcheck timeout. It looks like there are a few things we can
do to improve our chances of a successful GPU recovery.

The first is to ensure that the CX GDSC collapses, which clears the
internal state in the GPU's CX domain. The first 5 patches handle this.

The rest of the patches ensure that a few internal blocks like CP, GMU
and GBIF are halted properly before proceeding to a snapshot followed by
recovery. They also handle the 'prepare slumber' HFI failure correctly.
These are A6x specific improvements.

Changes in v2:
- Rebased on msm-next tip

Akhil P Oommen (7):
  drm/msm: Remove unnecessary pm_runtime_get/put
  drm/msm: Correct pm_runtime votes in recover worker
  drm/msm: Fix cx collapse issue during recovery
  drm/msm: Ensure cx gdsc collapse during recovery
  arm64: dts: qcom: sc7280: Update gpu register list
  drm/msm/a6xx: Improve gpu recovery sequence
  drm/msm/a6xx: Handle GMU prepare-slumber hfi failure

 arch/arm64/boot/dts/qcom/sc7280.dtsi  |  6 ++-
 drivers/gpu/drm/msm/adreno/a6xx.xml.h |  4 ++
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 83 ++++++++++++++++++++++-------------
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 36 +++++++++++++--
 drivers/gpu/drm/msm/msm_gpu.c         |  9 ++--
 drivers/gpu/drm/msm/msm_gpu.h         |  1 +
 drivers/gpu/drm/msm/msm_ringbuffer.c  |  4 --
 7 files changed, 100 insertions(+), 43 deletions(-)

-- 
2.7.4
