Message-ID: <aP/rHN4Ei04Tj49n@lstrano-desk.jf.intel.com>
Date: Mon, 27 Oct 2025 14:58:52 -0700
From: Matthew Brost <matthew.brost@...el.com>
To: Tejun Heo <tj@...nel.org>
CC: <intel-xe@...ts.freedesktop.org>, <dri-devel@...ts.freedesktop.org>,
<linux-kernel@...r.kernel.org>, <jiangshanlai@...il.com>,
<simona.vetter@...ll.ch>, <christian.koenig@....com>, <pstanner@...hat.com>,
<dakr@...nel.org>
Subject: Re: [RFC PATCH 1/3] workqueue: Add an interface to taint workqueue
 lockdep with reclaim

On Tue, Oct 21, 2025 at 03:51:08PM -1000, Tejun Heo wrote:
> On Tue, Oct 21, 2025 at 06:22:19PM -0700, Matthew Brost wrote:
> > > please update the patch so that WQ_MEM_RECLAIM implicitly enables the
> > > checking?
> >
> > Sure, but a bunch of things immediately break—including a convoluted
> > case in my driver. I can fix the kernel to the extent that my CI catches
> > issues, and fix any obvious cases through manual inspection. However,
> > I suspect that if we merge this, we'll be dealing with fallout
> > throughout a kernel RC cycle.
>
> Sure, we're still early in this cycle and can try to resolve as much as
> possible and if there's just too much, we can make it optional and so on.
>
I’ve come to the conclusion that the entire kernel is going to explode
if all WQ_MEM_RECLAIM workqueues are annotated with lockdep. I made this
change and tried booting Linux, but quickly ran into five issues before
giving up. It seems that many parts of the kernel allocate memory with
GFP_KERNEL, or take locks under which memory is allocated, in work items
on workqueues marked with WQ_MEM_RECLAIM.

There are literally hundreds of workqueues created with WQ_MEM_RECLAIM,
both directly [1] and via the create*workqueue helpers, which also set
this flag.

So, what’s your advice here? Personally, I don’t have the bandwidth to
drive fixing the entire kernel. Maybe we could add the annotation
helpers introduced in this series, so drivers that really want to
enforce this rule, like the DRM driver, can opt in? And perhaps we could
add a Kconfig option that enables this for all WQ_MEM_RECLAIM
workqueues, and let the community gradually enable it and start fixing
their code?

Matt

[1] https://elixir.bootlin.com/linux/v6.17.5/C/ident/WQ_MEM_RECLAIM
> Thanks.
>
> --
> tejun