lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aAjAZH7m96oz4ohC@pollux>
Date: Wed, 23 Apr 2025 12:26:44 +0200
From: Danilo Krummrich <dakr@...nel.org>
To: Tvrtko Ursulin <tvrtko.ursulin@...lia.com>
Cc: phasta@...nel.org, Lyude Paul <lyude@...hat.com>,
	David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
	Matthew Brost <matthew.brost@...el.com>,
	Christian König <ckoenig.leichtzumerken@...il.com>,
	Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
	Maxime Ripard <mripard@...nel.org>,
	Thomas Zimmermann <tzimmermann@...e.de>,
	dri-devel@...ts.freedesktop.org, nouveau@...ts.freedesktop.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/5] drm/sched: Warn if pending list is not empty

On Wed, Apr 23, 2025 at 11:10:51AM +0100, Tvrtko Ursulin wrote:
> 
> On 23/04/2025 09:48, Danilo Krummrich wrote:
> > On Wed, Apr 23, 2025 at 08:34:08AM +0100, Tvrtko Ursulin wrote:
> > > 
> > > IMO it is better to leave it. Regardless of whether it was added because
> > > some driver is actually operating like that, it does describe a _currently_
> > > workable option to avoid memory leaks. Once a better method is there, ie.
> > > FIXME is addressed, then it can be removed or replaced.
> > 
> > I'm not willing to sign off on encouraging drivers to rely on scheduler
> > internals -- also not in this case, sorry.
> > 
> > Our primary goal with the scheduler is to *remove* such broken contracts where
> > drivers rely on scheduler internal implementation details, mess with scheduler
> > internal data structures etc. This is clearly a step back.
> > 
> > And AFAICT, as by now drivers either do a) or simply nothing (with the exception
> > of the mock scheduler). Drivers can do a) in the meantime, there's no reason at
> > all to additionally offer b).
> 
> What mechanism do we currently have to enable using a), and which you would
> not consider needing knowledge of scheduler internals?

The rule is that drivers must not call drm_sched_fini() before all jobs are
processed.

For this, drivers may reference count their scheduler "subclass" for each job
in flight and only call drm_sched_fini() once the reference count becomes zero,
or keep their own list of jobs in flight and just wait for the list to become
empty.

This is possible with the regular API and none of this requires relying on
scheduler internals or messing with scheduler internal data structures.

The problem is limited to the aspect that the GPU scheduler does not provide an
API for drivers to solve this problem.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ