Message-ID: <aC8Dzgufa9E2MD6t@pollux>
Date: Thu, 22 May 2025 13:00:30 +0200
From: Danilo Krummrich <dakr@...nel.org>
To: Rob Clark <robdclark@...il.com>
Cc: Connor Abbott <cwabbott0@...il.com>, Rob Clark <robdclark@...omium.org>,
phasta@...nel.org, dri-devel@...ts.freedesktop.org,
freedreno@...ts.freedesktop.org, linux-arm-msm@...r.kernel.org,
Matthew Brost <matthew.brost@...el.com>,
Christian König <ckoenig.leichtzumerken@...il.com>,
Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
Maxime Ripard <mripard@...nel.org>,
Thomas Zimmermann <tzimmermann@...e.de>,
David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
open list <linux-kernel@...r.kernel.org>,
Boris Brezillon <boris.brezillon@...labora.com>
Subject: Re: [PATCH v4 04/40] drm/sched: Add enqueue credit limit

On Tue, May 20, 2025 at 10:22:54AM -0700, Rob Clark wrote:
> On Tue, May 20, 2025 at 9:54 AM Danilo Krummrich <dakr@...nel.org> wrote:
> > On Tue, May 20, 2025 at 09:07:05AM -0700, Rob Clark wrote:
> > > On Tue, May 20, 2025 at 12:06 AM Danilo Krummrich <dakr@...nel.org> wrote:
> > > > But even if we agree that we want to prevent userspace from ever OOMing
> > > > itself through async VM_BIND, the proposed solution seems wrong:
> > > >
> > > > Do we really want the driver developer to set an arbitrary limit on the
> > > > number of jobs that can be submitted before *async* VM_BIND blocks and
> > > > becomes semi-sync?
> > > >
> > > > How do we choose this number of jobs? A very small number to be safe, which
> > > > scales badly on powerful machines? A large number that scales well on powerful
> > > > machines, but OOMs on weaker ones?
> > >
> > > The way I am using it in msm, the credit amount and limit are in units
> > > of pre-allocated pages in flight. I set the enqueue_credit_limit to
> > > 1024 pages; once the jobs queued up exceed that limit, further
> > > submissions start blocking.
> > >
> > > The number of _jobs_ is irrelevant; what matters is the number of
> > > pre-alloc'd pages in flight.
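
As a concrete illustration of the scheme described above -- the
enqueue_credit_limit / enqueue_credits names are taken from this series,
while the msm structures and the helper below are simplified stand-ins,
not the actual patch:

#include <drm/gpu_scheduler.h>

struct msm_vm_bind_job {
        struct drm_sched_job base;
        u32 nr_prealloc_pages; /* pages pre-allocated for pagetable updates */
};

/* At scheduler init: allow at most 1024 pre-allocated pages in flight. */
static const struct drm_sched_init_args sched_args = {
        /* ... the usual name/credit/timeout fields ... */
        .enqueue_credit_limit = 1024,
};

static void msm_vm_bind_push(struct msm_vm_bind_job *job)
{
        /* Charge the job with the pages pre-allocated for it ... */
        job->base.enqueue_credits = job->nr_prealloc_pages;

        /*
         * ... so that pushing blocks (async VM_BIND turns semi-sync)
         * once the enqueue credits of all queued jobs would exceed
         * enqueue_credit_limit.
         */
        drm_sched_entity_push_job(&job->base);
}
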
> >
> > That doesn't change my question. How do you know 1024 pages is a good
> > value? How do we scale for different machines with different capabilities?
> >
> > If you have a powerful machine with lots of memory, we might throttle userspace
> > for no reason, no?
> >
> > If the machine has very limited resources, it might already be too much?
>
> It may be a bit arbitrary, but then again I'm not sure that userspace
> is in any better position to pick an appropriate limit.
>
> 4MB of in-flight pages isn't going to be too much for anything that is
> capable enough to run vk, but still allows for a lot of in-flight
> maps.

Ok, but what about the other way around? What's the performance impact if the
limit is chosen rather small, but we're running on a very powerful machine?

Since you already have the implementation for hardware you have access to, can
you please check if and how performance degrades when you use a very small
threshold?

Also, I think we should probably put this throttle mechanism in a separate
component that just wraps a counter of bytes (or rather pages), which can be
increased and decreased through an API, where the increase blocks once a
certain threshold is exceeded.

This component can then be called by a driver from the job submit IOCTL and
from the corresponding place where the pre-allocated memory is actually used /
freed. Depending on the driver, the latter is not necessarily in the
scheduler's run_job() callback.

We could call the component something like drm_throttle or drm_submit_throttle.
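
A minimal sketch of what I have in mind (purely illustrative; name, API and
locking details are of course up for discussion):

#include <linux/spinlock.h>
#include <linux/wait.h>

struct drm_throttle {
        spinlock_t lock;
        wait_queue_head_t wq;
        unsigned long count; /* pages (or bytes) currently in flight */
        unsigned long limit; /* threshold at which increasing blocks */
};

static void drm_throttle_init(struct drm_throttle *t, unsigned long limit)
{
        spin_lock_init(&t->lock);
        init_waitqueue_head(&t->wq);
        t->count = 0;
        t->limit = limit;
}

/* Called from the job submit IOCTL; blocks while over the threshold. */
static int drm_throttle_inc(struct drm_throttle *t, unsigned long n)
{
        return wait_event_interruptible(t->wq, ({
                bool ok;

                spin_lock(&t->lock);
                ok = t->count < t->limit;
                if (ok)
                        t->count += n;
                spin_unlock(&t->lock);
                ok;
        }));
}

/* Called where the pre-allocated memory is actually used / freed. */
static void drm_throttle_dec(struct drm_throttle *t, unsigned long n)
{
        spin_lock(&t->lock);
        t->count -= n;
        spin_unlock(&t->lock);
        wake_up_all(&t->wq);
}

The driver would call drm_throttle_inc() with the number of pre-allocated
pages from its submit IOCTL, and drm_throttle_dec() once that memory has
actually been consumed or freed -- which, again, is not necessarily in
run_job().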