linux-kernel - Re: [PATCH 1/2] drm/msm/gpu: Wait for idle before suspending

Message-ID: <CAF6AEGsO1f5DC8AWjjA+9XLne3CRMGsLfLwWbv3iQVZW3wUTiw@mail.gmail.com>
Date:   Sat, 8 Jan 2022 09:41:35 -0800
From:   Rob Clark <robdclark@...il.com>
To:     Stephen Boyd <swboyd@...omium.org>
Cc:     dri-devel <dri-devel@...ts.freedesktop.org>,
        freedreno <freedreno@...ts.freedesktop.org>,
        linux-arm-msm <linux-arm-msm@...r.kernel.org>,
        Rob Clark <robdclark@...omium.org>,
        Sean Paul <sean@...rly.run>,
        Abhinav Kumar <quic_abhinavk@...cinc.com>,
        David Airlie <airlied@...ux.ie>,
        Daniel Vetter <daniel@...ll.ch>,
        AngeloGioacchino Del Regno 
        <angelogioacchino.delregno@...labora.com>,
        Jonathan Marek <jonathan@...ek.ca>,
        Jordan Crouse <jordan@...micpenguin.net>,
        Akhil P Oommen <quic_akhilpo@...cinc.com>,
        Vladimir Lypak <vladimir.lypak@...il.com>,
        Bjorn Andersson <bjorn.andersson@...aro.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/2] drm/msm/gpu: Wait for idle before suspending

On Fri, Jan 7, 2022 at 4:27 PM Stephen Boyd <swboyd@...omium.org> wrote:
>
> Quoting Rob Clark (2022-01-06 10:14:46)
> > From: Rob Clark <robdclark@...omium.org>
> >
> > System suspend uses pm_runtime_force_suspend(), which cheekily bypasses
> > the runpm reference counts.  This doesn't actually work so well when the
> > GPU is active.  So add a reasonable delay waiting for the GPU to become
> > idle.
>
> Maybe also say:
>
> Failure to wait during system wide suspend leads to GPU hangs seen on
> resume.

The fallout can actually be a lot more than just GPU hangs. That is
just the case that is easy (for us) to observe, because the crash
logging captures it.  But sync/async external aborts are also
possible, and I think even just undefined behavior (i.e. I think if
the timing works out right, it can survive but just "lose" rendering
that hadn't completed yet).

> >
> > Alternatively we could just return -EBUSY in this case, but that has the
> > disadvantage of causing system suspend to fail.
> >
> > Signed-off-by: Rob Clark <robdclark@...omium.org>
> > ---
> >  drivers/gpu/drm/msm/adreno/adreno_device.c | 9 +++++++++
> >  drivers/gpu/drm/msm/msm_gpu.c              | 3 +++
> >  drivers/gpu/drm/msm/msm_gpu.h              | 3 +++
> >  3 files changed, 15 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > index 93005839b5da..b677ca3fd75e 100644
> > --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> > +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > @@ -611,6 +611,15 @@ static int adreno_resume(struct device *dev)
> >  static int adreno_suspend(struct device *dev)
> >  {
> >         struct msm_gpu *gpu = dev_to_gpu(dev);
> > +       int ret = 0;
>
> Please don't assign and then immediately overwrite.
>
> > +
> > +       ret = wait_event_timeout(gpu->retire_event,
> > +                                !msm_gpu_active(gpu),
> > +                                msecs_to_jiffies(1000));
> > +       if (ret == 0) {
>
> The usual pattern is
>
>         long timeleft;
>
>         timeleft = wait_event_timeout(...)
>         if (!timeleft) {
>                 /* no time left; timed out */
>
> Can it be the same pattern here? It helps because people sometimes
> forget that wait_event_timeout() returns the time that is left and not
> an error code when it times out.

OK, I'll update in v2.

BR,
-R

> > +               dev_err(dev, "Timeout waiting for GPU to suspend\n");
> > +               return -EBUSY;
> > +       }
> >
> >         return gpu->funcs->pm_suspend(gpu);
> >  }
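
For readers following the review comment about the return value: wait_event_timeout() returns 0 only when the condition is still false after the timeout has elapsed; otherwise it returns the remaining time, which is at least 1. The userspace sketch below only imitates that convention so the suggested `timeleft` pattern can be tried outside the kernel. `struct mock_gpu`, `mock_wait_for_idle()` and `mock_adreno_suspend()` are hypothetical names invented for this illustration. They are not part of the msm driver, and this is not the v2 patch.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for struct msm_gpu: busy for `busy_ticks` more ticks. */
struct mock_gpu {
	long busy_ticks;
};

/*
 * Userspace imitation of the wait_event_timeout() return convention:
 *   0    the condition is still false when the timeout has elapsed
 *   >= 1 the condition became true; the value is the time left (minimum 1)
 */
long mock_wait_for_idle(struct mock_gpu *gpu, long timeout_ticks)
{
	long timeleft = timeout_ticks;

	/* Each loop iteration models one tick of waiting. */
	while (gpu->busy_ticks > 0 && timeleft > 0) {
		gpu->busy_ticks--;
		timeleft--;
	}

	if (gpu->busy_ticks > 0)
		return 0;			/* timed out, still busy */

	return timeleft ? timeleft : 1;		/* idle; never report 0 here */
}

/* The pattern suggested in the review: name the result for what it is. */
int mock_adreno_suspend(struct mock_gpu *gpu)
{
	long timeleft;

	timeleft = mock_wait_for_idle(gpu, 1000);
	if (!timeleft) {
		/* no time left; timed out */
		fprintf(stderr, "Timeout waiting for GPU to suspend\n");
		return -EBUSY;
	}

	/* The real driver would go on to call gpu->funcs->pm_suspend(gpu). */
	return 0;
}
```

Naming the variable `timeleft` and declaring it as `long` makes it harder to misread the result as an errno-style code, which is the point raised in the review.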