[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20231027184143.4427edb8@collabora.com>
Date: Fri, 27 Oct 2023 18:41:43 +0200
From: Boris Brezillon <boris.brezillon@...labora.com>
To: Luben Tuikov <ltuikov89@...il.com>
Cc: Danilo Krummrich <dakr@...hat.com>, matthew.brost@...el.com,
linux-kernel@...r.kernel.org, dri-devel@...ts.freedesktop.org,
faith@...strand.net, luben.tuikov@....com, christian.koenig@....com
Subject: Re: [PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow
control
On Fri, 27 Oct 2023 10:32:52 -0400
Luben Tuikov <ltuikov89@...il.com> wrote:
> On 2023-10-27 04:25, Boris Brezillon wrote:
> > Hi Danilo,
> >
> > On Thu, 26 Oct 2023 18:13:00 +0200
> > Danilo Krummrich <dakr@...hat.com> wrote:
> >
> >> Currently, job flow control is implemented simply by limiting the number
> >> of jobs in flight. Therefore, a scheduler is initialized with a credit
> >> limit that corresponds to the number of jobs which can be sent to the
> >> hardware.
> >>
> >> This implies that for each job, drivers need to account for the maximum
> >> job size possible in order to not overflow the ring buffer.
> >>
> >> However, there are drivers, such as Nouveau, where the job size has a
> >> rather large range. For such drivers it can easily happen that job
> >> submissions not even filling the ring by 1% can block subsequent
> >> submissions, which, in the worst case, can lead to the ring run dry.
> >>
> >> In order to overcome this issue, allow for tracking the actual job size
> >> instead of the number of jobs. Therefore, add a field to track a job's
> >> credit count, which represents the number of credits a job contributes
> >> to the scheduler's credit limit.
> >>
> >> Signed-off-by: Danilo Krummrich <dakr@...hat.com>
> >> ---
> >> Changes in V2:
> >> ==============
> >> - fixed up influence on scheduling fairness due to consideration of a job's
> >> size
> >> - If we reach a ready entity in drm_sched_select_entity() but can't actually
> >> queue a job from it due to size limitations, just give up and go to sleep
> >> until woken up due to a pending job finishing, rather than continue to try
> >> other entities.
> >> - added a callback to dynamically update a job's credits (Boris)
> >
> > This callback seems controversial. I'd suggest dropping it, so the
> > patch can be merged.
>
> Sorry, why is it controversial? (I did read back-and-forth above, but it wasn't clear
> why it is /controversial/.)
That's a question for Christian, I guess. I didn't quite get what he
was worried about, other than this hook introducing a new way for
drivers to screw things up by returning funky/invalid credits (which we
can report with WARN_ON()s). But let's be honest, there's probably a
hundred different ways (if not more) drivers can shoot themselves in the
foot with drm_sched already...
>
> I believe only drivers are privy to changes in the credit availability as their
> firmware and hardware executes new jobs and finishes others, and so this "update"
> here is essential--leaving it only to prepare_job() wouldn't quite fulfill the vision
> of why the credit mechanism introduced by this patch in the first place.
I kinda agree with you, even if I wouldn't so pessimistic as to how
useful this patch would be without the ->update_job_credits() hook
(it already makes the situation a lot better for Nouveau and probably
other drivers too).
Powered by blists - more mailing lists