Message-ID: <aFF3YIAFkgsAKvQV@pollux>
Date: Tue, 17 Jun 2025 16:10:40 +0200
From: Danilo Krummrich <dakr@...nel.org>
To: Philipp Stanner <phasta@...nel.org>,
	Matthew Brost <matthew.brost@...el.com>,
	Christian König <ckoenig.leichtzumerken@...il.com>,
	David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
	Sumit Semwal <sumit.semwal@...aro.org>,
	dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org,
	linux-media@...r.kernel.org
Subject: Re: [PATCH v2] drm/sched: Clarify scenarios for separate workqueues

On Tue, Jun 17, 2025 at 03:51:33PM +0200, Simona Vetter wrote:
> On Thu, Jun 12, 2025 at 04:49:54PM +0200, Philipp Stanner wrote:
> > + * NOTE that sharing &struct drm_sched_init_args.submit_wq with the driver
> > + * theoretically can deadlock. It must be guaranteed that submit_wq never has
> > + * more than max_active - 1 active tasks, or if max_active tasks are reached at
> > + * least one of them does not execute operations that may block on dma_fences
> > + * that potentially make progress through this scheduler instance. Otherwise,
> > + * it is possible that all max_active tasks end up waiting on a dma_fence (that
> > + * can only make progress through this scheduler instance), while the
> > + * scheduler's queued work waits for at least one of the max_active tasks to
> > + * finish. Thus, this can result in a deadlock.
> 
> Uh if you have an ordered wq you deadlock with just one misuse. I'd just
> explain that the wq must provide sufficient forward-progress guarantees
> for the scheduler, specifically that it's on the dma_fence signalling
> critical path and leave the concrete examples for people to figure out
> when they design a specific locking scheme.

This isn't a concrete example, is it? It's exactly what you say in slightly
different words, with the addition of highlighting the impact of the workqueue's
max_active configuration.

I think that's relevant, because with max_active = N, up to N - 1 active tasks
can be on the dma_fence signalling critical path without issues.
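
To illustrate the N - 1 point, here is a toy model in plain Python (not kernel
code, and not how the workqueue core is implemented). It assumes a single
queue that runs at most max_active items at once, started in queue order.
"wait" stands for driver work that blocks on a dma_fence which only this
scheduler instance can signal, and "signal" stands for the scheduler's own
queued work. The function name and the item labels are made up for this
sketch.

```python
from collections import deque


def runs_to_completion(max_active, queued):
    """Toy model of a workqueue shared between a driver and the scheduler.

    queued is a list of "wait" / "signal" items, started in order:
      "wait"   - blocks until the fence is signalled (hypothetical label)
      "signal" - the scheduler's work; running it signals the fence
    Returns True if every item can finish, False if the queue gets stuck.
    """
    pending = deque(queued)
    waiting = 0            # active items currently blocked on the fence
    fence_signalled = False

    while pending:
        if fence_signalled:
            waiting = 0    # all blocked items wake up and finish
        if waiting >= max_active:
            # Every slot is held by a blocked item, so nothing that is
            # still queued (including the scheduler's work) can start.
            return False
        item = pending.popleft()
        if item == "signal":
            fence_signalled = True
        elif not fence_signalled:
            waiting += 1   # item starts and immediately blocks
        # A "wait" started after the fence is signalled finishes at once.

    # Stuck if blocked items remain and nothing ever signalled the fence.
    return fence_signalled or waiting == 0


# N - 1 blocked items still leave one slot for the scheduler's work.
print(runs_to_completion(3, ["wait", "wait", "signal"]))
# N blocked items fill every slot before the scheduler's work can run.
print(runs_to_completion(2, ["wait", "wait", "signal"]))
# Ordered queue (max_active = 1): a single blocked item is enough.
print(runs_to_completion(1, ["wait", "signal"]))
```

In this model the first call finishes and the other two get stuck, which
covers both the max_active - 1 bound and the ordered-wq case mentioned above.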

We could change

	"if max_active tasks are reached at least one of them must not execute
	 operations that may block on dma_fences that potentially make progress
	 through this scheduler instance"

to 

	"if max_active tasks are reached at least one of them must not be on the
	 dma_fence signalling critical path"

which is a bit more to the point I think.
