Message-ID: <CADaigPXJJoEgWK6nx8yc_DVsDAv1VdzuA5NYZO63K=hHVvT2JQ@mail.gmail.com>
Date: Fri, 1 May 2020 12:26:40 -0700
From: Eric Anholt <eric@...olt.net>
To: Jordan Crouse <jcrouse@...eaurora.org>
Cc: linux-arm-msm@...r.kernel.org, stable@...r.kernel.org,
Akhil P Oommen <akhilpo@...eaurora.org>,
AngeloGioacchino Del Regno <kholk11@...il.com>,
Ben Dooks <ben.dooks@...ethink.co.uk>,
Daniel Vetter <daniel@...ll.ch>,
David Airlie <airlied@...ux.ie>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Jeffrey Hugo <jeffrey.l.hugo@...il.com>,
"Michael J. Ruhl" <michael.j.ruhl@...el.com>,
Rob Clark <robdclark@...il.com>, Sean Paul <sean@...rly.run>,
Sharat Masetty <smasetty@...eaurora.org>,
Thomas Gleixner <tglx@...utronix.de>,
DRI Development <dri-devel@...ts.freedesktop.org>,
freedreno@...ts.freedesktop.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] drm/msm: Check for powered down HW in the devfreq callbacks
On Fri, May 1, 2020 at 12:03 PM Jordan Crouse <jcrouse@...eaurora.org> wrote:
>
> Writing to the devfreq sysfs nodes while the GPU is powered down can
> result in a system crash (on a5xx) or a nasty GMU error (on a6xx):
>
> $ /sys/class/devfreq/5000000.gpu# echo 500000000 > min_freq
> [ 104.841625] platform 506a000.gmu: [drm:a6xx_gmu_set_oob]
> *ERROR* Timeout waiting for GMU OOB set GPU_DCVS: 0x0
>
> Despite the fact that we carefully try to suspend the devfreq device when
> the hardware is powered down there are lots of holes in the governors that
> don't check for the suspend state and blindly call into the devfreq
> callbacks that end up triggering hardware reads in the GPU driver.
>
> Call pm_runtime_get_if_in_use() in the gpu_busy() and gpu_set_freq()
> callbacks to skip the hardware access if it isn't active.
>
> v2: Use pm_runtime_get_if_in_use() per Eric Anholt
>
> Cc: stable@...r.kernel.org
> Signed-off-by: Jordan Crouse <jcrouse@...eaurora.org>
> ---
>
> drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 6 ++++++
> drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 8 ++++++++
> drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 7 +++++++
> 3 files changed, 21 insertions(+)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> index 724024a2243a..4d7f269edfcc 100644
> --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> @@ -1404,6 +1404,10 @@ static unsigned long a5xx_gpu_busy(struct msm_gpu *gpu)
> {
> u64 busy_cycles, busy_time;
>
> + /* Only read the gpu busy if the hardware is already active */
> + if (pm_runtime_get_if_in_use(&gpu->pdev->dev) <= 0)
> + return 0;
> +
The runtime PM (RPM) APIs are a bit of a trap: the get functions return
a negative errno when runtime PM is disabled in Kconfig, even though
that usually means the power domain is never disabled by RPM. I think
in these checks you want "if (pm_runtime_get_if_in_use() == 0)", which
seems to be a common pattern in other drivers. With
that,
Reviewed-by: Eric Anholt <eric@...olt.net>
(and tested, too)
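
To make the difference between the two checks concrete, here is a minimal userspace sketch. It is not driver code: mock_get_if_in_use() and the mock_rpm_state enum are hypothetical stand-ins that only model the three kinds of return value discussed above (positive when the device is active and in use, 0 when it is not, a negative errno when runtime PM is disabled).

```c
#include <errno.h>

/* Hypothetical model of the runtime PM situations discussed above. */
enum mock_rpm_state {
	MOCK_RPM_DISABLED,	/* runtime PM disabled, e.g. in Kconfig */
	MOCK_RPM_SUSPENDED,	/* runtime PM enabled, hardware powered down */
	MOCK_RPM_ACTIVE,	/* runtime PM enabled, hardware active and in use */
};

/*
 * Assumed stand-in for pm_runtime_get_if_in_use(): returns 1 when the
 * device is active, 0 when it is not, and -EINVAL when runtime PM is
 * disabled. No usage counting is modelled here.
 */
static int mock_get_if_in_use(enum mock_rpm_state state)
{
	switch (state) {
	case MOCK_RPM_ACTIVE:
		return 1;
	case MOCK_RPM_SUSPENDED:
		return 0;
	default:
		return -EINVAL;
	}
}

/* Check as posted in v2: skip the hardware on any non-positive return. */
static int skip_hw_v2(enum mock_rpm_state state)
{
	return mock_get_if_in_use(state) <= 0;
}

/* Suggested check: skip the hardware only when the device is not active. */
static int skip_hw_suggested(enum mock_rpm_state state)
{
	return mock_get_if_in_use(state) == 0;
}
```

With runtime PM disabled, skip_hw_v2() reports that the hardware access should be skipped, so gpu_busy() would always return 0, while skip_hw_suggested() lets the access through. For the suspended and active states the two checks agree.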