Message-ID: <CAF6AEGvsEzsAQp6umSN5-mH3onuXDj+_uK=jrwHk7U95x0PeFQ@mail.gmail.com>
Date: Mon, 5 Oct 2020 15:58:10 -0700
From: Rob Clark <robdclark@...il.com>
To: Ville Syrjälä <ville.syrjala@...ux.intel.com>,
Rob Clark <robdclark@...il.com>,
Rob Clark <robdclark@...omium.org>,
linux-arm-msm <linux-arm-msm@...r.kernel.org>,
open list <linux-kernel@...r.kernel.org>,
Tim Murray <timmurray@...gle.com>,
dri-devel <dri-devel@...ts.freedesktop.org>,
Tejun Heo <tj@...nel.org>, Qais Yousef <qais.yousef@....com>
Cc: Daniel Vetter <daniel@...ll.ch>
Subject: Re: [PATCH v2 0/3] drm: commit_work scheduling
On Mon, Oct 5, 2020 at 7:15 AM Daniel Vetter <daniel@...ll.ch> wrote:
>
> On Mon, Oct 05, 2020 at 03:15:24PM +0300, Ville Syrjälä wrote:
> > On Fri, Oct 02, 2020 at 10:55:52AM -0700, Rob Clark wrote:
> > > On Fri, Oct 2, 2020 at 4:05 AM Ville Syrjälä
> > > <ville.syrjala@...ux.intel.com> wrote:
> > > >
> > > > On Fri, Oct 02, 2020 at 01:52:56PM +0300, Ville Syrjälä wrote:
> > > > > On Thu, Oct 01, 2020 at 05:25:55PM +0200, Daniel Vetter wrote:
> > > > > > On Thu, Oct 1, 2020 at 5:15 PM Rob Clark <robdclark@...il.com> wrote:
> > > > > > >
> > > > > > > On Thu, Oct 1, 2020 at 12:25 AM Daniel Vetter <daniel@...ll.ch> wrote:
> > > > > > > >
> > > > > > > > On Wed, Sep 30, 2020 at 11:16 PM Rob Clark <robdclark@...il.com> wrote:
> > > > > > > > >
> > > > > > > > > From: Rob Clark <robdclark@...omium.org>
> > > > > > > > >
> > > > > > > > > The android userspace treats the display pipeline as a realtime problem.
> > > > > > > > > And arguably, if your goal is to not miss frame deadlines (ie. vblank),
> > > > > > > > > > it is. (See https://lwn.net/Articles/809545/ for the best explanation
> > > > > > > > > that I found.)
> > > > > > > > >
> > > > > > > > > But this presents a problem with using workqueues for non-blocking
> > > > > > > > > atomic commit_work(), because the SCHED_FIFO userspace thread(s) can
> > > > > > > > > preempt the worker. Which is not really the outcome you want.. once
> > > > > > > > > the required fences are scheduled, you want to push the atomic commit
> > > > > > > > > down to hw ASAP.
> > > > > > > > >
> > > > > > > > > But the decision of whether commit_work should be RT or not really
> > > > > > > > > depends on what userspace is doing. For a pure CFS userspace display
> > > > > > > > > pipeline, commit_work() should remain SCHED_NORMAL.
> > > > > > > > >
> > > > > > > > > To handle this, convert non-blocking commit_work() to use per-CRTC
> > > > > > > > > kthread workers, instead of system_unbound_wq. Per-CRTC workers are
> > > > > > > > > used to avoid serializing commits when userspace is using a per-CRTC
> > > > > > > > > update loop. And the last patch exposes the task id to userspace as
> > > > > > > > > a CRTC property, so that userspace can adjust the priority and sched
> > > > > > > > > > policy to fit its needs.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > v2: Drop client cap and in-kernel setting of priority/policy in
> > > > > > > > > favor of exposing the kworker tid to userspace so that user-
> > > > > > > > > space can set priority/policy.
> > > > > > > >
> > > > > > > > Yeah I think this looks more reasonable. Still a bit irky interface,
> > > > > > > > so I'd like to get some kworker/rt ack on this. Other opens:
> > > > > > > > - needs userspace, the usual drill
> > > > > > >
> > > > > > > fwiw, right now the userspace is "modetest + chrt".. *probably* the
> > > > > > > userspace will become a standalone helper or daemon, mostly because
> > > > > > > the chrome gpu-process sandbox does not allow setting SCHED_FIFO. I'm
> > > > > > > still entertaining the possibility of switching between rt and cfs
> > > > > > > depending on what is in the foreground (ie. only do rt for android
> > > > > > > apps).
> > > > > > >
> > > > > > > > > - we need this also for vblank workers, otherwise this won't work for
> > > > > > > > drivers needing those because of another priority inversion.
> > > > > > >
> > > > > > > I have a thought on that, see below..
> > > > > >
> > > > > > Hm, not seeing anything about vblank worker below?
> > > > > >
> > > > > > > > - we probably want some indication of whether this actually does
> > > > > > > > something useful, not all drivers use atomic commit helpers. Not sure
> > > > > > > > how to do that.
> > > > > > >
> > > > > > > I'm leaning towards converting the other drivers over to use the
> > > > > > > > per-crtc kwork, and then dropping the 'commit_work' from atomic state.
> > > > > > > I can add a patch to that, but figured I could postpone that churn
> > > > > > > > until there is some buy-in on this whole idea.
> > > > > >
> > > > > > i915 has its own commit code, it's not even using the current commit
> > > > > > helpers (nor the commit_work). Not sure how much other fun there is.
> > > > >
> > > > > I don't think we want per-crtc threads for this in i915. Seems
> > > > > to me easier to guarantee atomicity across multiple crtcs if
> > > > > we just commit them from the same thread.
> > > >
> > > > Oh, and we may have to commit things in a very specific order
> > > > to guarantee the hw doesn't fall over, so yeah definitely per-crtc
> > > > thread is a no go.
> > >
> > > If I'm understanding the i915 code, this is only the case for modeset
> > > commits? I suppose we could achieve the same result by just deciding
> > > to pick the kthread of the first CRTC for modeset commits. I'm not
> > > really so much concerned about parallelism for modeset.
> >
> > I'm not entirely happy about the random differences between modesets
> > and other commits. Ideally we wouldn't need any.
> >
> > Anyways, even if we ignore modesets we still have the issue with
> > atomicity guarantees across multiple crtcs. So I think we still
> > don't want per-crtc threads, rather it should be a thread for each
> > commit.
> >
> > Well, if the crtcs aren't running in lockstep then maybe we could
> > shove them off to separate threads, but that'll just complicate things
> > needlessly I think since we'd need yet another way to iterate
> > the crtcs in each thread. With the thread-per-commit approach we
> > can just use the normal atomic iterators.
> >
> > >
> > > > I don't even understand the serialization argument. If the commits
> > > > are truly independent then why isn't the unbound wq enough to avoid
> > > > the serialization? It should just spin up a new thread for each commit
> > > > no?
> > >
> > > The problem with wq is prioritization and SCHED_FIFO userspace
> > > components stomping on the feet of commit_work. That is the entire
> > > motivation of this series in the first place, so no we cannot use
> > > unbound wq.
> >
> > This is a bit of déjà vu from the vblank worker discussion, where I actually
> > did want a per-crtc RT kthread but people weren't convinced they
> > actually help. The difference is that for vblank workers we actually
> > tried to get some numbers, here I've not seen any.
>
> The problem here is priority inversion, not latency: Android runs
> surface-flinger as SCHED_FIFO, so when surfaceflinger does something it
> can preempt the kernel's commit work, and we miss a frame. Apparently
> otherwise the soft-rt of just having a normal worker (with maybe elevated
> niceness) seems nice enough.

Yes, exactly, this is about priority inversion.

Not sure if this is clear (you can't really fit all the relevant parts
of the trace in one screenshot), but here is an example of commit_work
preempted by SF EventThread and missing a deadline:

https://usercontent.irccloud-cdn.com/file/Awgp8Sdj/image.png
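
As a rough illustration of the userspace side discussed in this thread
(a helper or daemon adjusting the scheduling policy of a commit worker
whose task id the kernel exposes), here is a minimal C sketch. It assumes
the tid has already been obtained; how it is read from the CRTC property
is not shown. The names clamp_priority() and set_worker_sched() are
hypothetical and not part of this series. Only the standard
sched_setscheduler(2) and sched_get_priority_min/max(2) calls are used.

```c
#include <errno.h>
#include <sched.h>
#include <sys/types.h>

/*
 * Hypothetical helper: clamp a requested priority into the range the
 * kernel reports as valid for the given policy.
 * Returns the clamped priority, or -1 if the policy is not recognised.
 */
int clamp_priority(int policy, int prio)
{
	int lo = sched_get_priority_min(policy);
	int hi = sched_get_priority_max(policy);

	if (lo < 0 || hi < 0)
		return -1;
	if (prio < lo)
		return lo;
	if (prio > hi)
		return hi;
	return prio;
}

/*
 * Hypothetical helper: apply a policy/priority to the task identified
 * by tid (assumed to be the commit worker's task id, obtained
 * elsewhere). Returns 0 on success or a negative errno value.
 */
int set_worker_sched(pid_t tid, int policy, int prio)
{
	struct sched_param param = { 0 };
	int clamped = clamp_priority(policy, prio);

	if (clamped < 0)
		return -EINVAL;

	param.sched_priority = clamped;
	if (sched_setscheduler(tid, policy, &param) != 0)
		return -errno;
	return 0;
}
```

A helper would call something like set_worker_sched(tid, SCHED_FIFO, prio)
with a priority chosen relative to the SCHED_FIFO userspace threads, and
set_worker_sched(tid, SCHED_OTHER, 0) to go back to CFS. Setting a realtime
policy normally requires CAP_SYS_NICE or a suitable RLIMIT_RTPRIO, which
is why a sandboxed process may not be able to do this itself.
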
BR,
-R
>
> Aside: I just double-checked, and vblank work has a per-crtc kthread.
> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch