[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAF6AEGsyFAOPmHqT7YX1wsukP4-gYAstCukr89r9w28V0YSCUw@mail.gmail.com>
Date: Thu, 17 Mar 2022 10:35:24 -0700
From: Rob Clark <robdclark@...il.com>
To: Christian König <christian.koenig@....com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@....com>,
dri-devel <dri-devel@...ts.freedesktop.org>,
freedreno <freedreno@...ts.freedesktop.org>,
linux-arm-msm <linux-arm-msm@...r.kernel.org>,
Rob Clark <robdclark@...omium.org>,
Sean Paul <sean@...rly.run>,
Abhinav Kumar <quic_abhinavk@...cinc.com>,
David Airlie <airlied@...ux.ie>,
Akhil P Oommen <quic_akhilpo@...cinc.com>,
Jonathan Marek <jonathan@...ek.ca>,
AngeloGioacchino Del Regno
<angelogioacchino.delregno@...labora.com>,
Bjorn Andersson <bjorn.andersson@...aro.org>,
Vladimir Lypak <vladimir.lypak@...il.com>,
open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/3] drm/msm/gpu: Park scheduler threads for system suspend
On Thu, Mar 17, 2022 at 9:45 AM Christian König
<christian.koenig@....com> wrote:
>
> Am 17.03.22 um 17:18 schrieb Rob Clark:
> > On Thu, Mar 17, 2022 at 9:04 AM Christian König
> > <christian.koenig@....com> wrote:
> >> Am 17.03.22 um 16:10 schrieb Rob Clark:
> >>> [SNIP]
> >>> userspace frozen != kthread frozen .. that is what this patch is
> >>> trying to address, so we aren't racing between shutting down the hw
> >>> and the scheduler shoveling more jobs at us.
> >> Well exactly that's the problem. The scheduler is supposed to shoveling
> >> more jobs at us until it is empty.
> >>
> >> Thinking more about it we will then keep some dma_fence instance
> >> unsignaled and that is and extremely bad idea since it can lead to
> >> deadlocks during suspend.
> > Hmm, perhaps that is true if you need to migrate things out of vram?
> > It is at least not a problem when vram is not involved.
>
> No, it's much wider than that.
>
> See what can happen is that the memory management shrinkers want to wait
> for a dma_fence during suspend.
we don't wait on fences in shrinker, only purging or evicting things
that are already ready. Actually, waiting on fences in shrinker path
sounds like a pretty bad idea.
> And if you stop the scheduler they will just wait forever.
>
> What you need to do instead is to drain the scheduler, e.g. call
> drm_sched_entity_flush() with a proper timeout for each entity you have
> created.
yeah, it would work to drain the scheduler.. I guess that might be the
more portable approach as far as generic solution for suspend.
BR,
-R
> Regards,
> Christian.
>
> >
> >> So this patch here is an absolute clear NAK from my side. If amdgpu is
> >> doing something similar that is a severe bug and needs to be addressed
> >> somehow.
> > I think amdgpu's use of kthread_park is not related to suspend, but
> > didn't look too closely.
> >
> > And perhaps the solution for this problem is more complex in the case
> > of amdgpu, I'm not super familiar with the constraints there. But I
> > think it is a fine solution for integrated GPUs.
> >
> > BR,
> > -R
> >
> >> Regards,
> >> Christian.
> >>
> >>> BR,
> >>> -R
> >>>
>
Powered by blists - more mailing lists