Message-ID: <Yn08TdfdsFjKx+qw@slm.duckdns.org>
Date: Thu, 12 May 2022 06:56:45 -1000
From: Tejun Heo <tj@...nel.org>
To: Dmitry Torokhov <dmitry.torokhov@...il.com>
Cc: Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Lai Jiangshan <jiangshanlai@...il.com>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3 (repost)] workqueue: Warn flushing of kernel-global workqueues
Hello, Dmitry.
On Thu, May 12, 2022 at 06:13:35AM -0700, Dmitry Torokhov wrote:
> > > This means that now the code has to keep track of all work items that it
> > > allocated, instead of being able "fire and forget" works (when dealing
> > > with extremely infrequent events) and rely on flush_workqueue() to
> > > cleanup.
> >
> > Yes. Moreover, a patch to catch and refuse at compile time was proposed at
> > https://lkml.kernel.org/r/738afe71-2983-05d5-f0fc-d94efbdf7634@I-love.SAKURA.ne.jp .
>
> My comment was not a wholesale endorsement of Tejun's statement, but
> rather a note of the fact that it again adds complexity (at least as far
> as driver writers are concerned) to the kernel code.
I was more thinking about cases where there are a small number of static
work items. If there are multiple dynamic work items, creating a workqueue
as a flush domain is the way to go. It does add a bit of complexity but
shouldn't be too bad - e.g. it just adds the alloc_workqueue() during init
and the exit path can simply use destroy_workqueue() instead of flush.
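
For illustration, a minimal sketch of that pattern might look like the
following. The driver name "foo" and every foo_* identifier are hypothetical,
the sketch assumes the driver stops generating events before its exit path
runs, and it builds only as a kernel module, not as a userspace program.

```c
/* Sketch only: "foo" and all foo_* names are hypothetical. */
#include <linux/module.h>
#include <linux/slab.h>
#include <linux/workqueue.h>

static struct workqueue_struct *foo_wq;	/* private flush domain */

struct foo_event {
	struct work_struct work;
	int code;
};

static void foo_event_fn(struct work_struct *work)
{
	struct foo_event *ev = container_of(work, struct foo_event, work);

	pr_info("foo: handling event %d\n", ev->code);
	kfree(ev);	/* fire and forget: the work item frees itself */
}

/* Hypothetical event hook; GFP_ATOMIC so it may run in atomic context. */
static int __maybe_unused foo_report_event(int code)
{
	struct foo_event *ev = kmalloc(sizeof(*ev), GFP_ATOMIC);

	if (!ev)
		return -ENOMEM;
	ev->code = code;
	INIT_WORK(&ev->work, foo_event_fn);
	queue_work(foo_wq, &ev->work);	/* instead of schedule_work() */
	return 0;
}

static int __init foo_init(void)
{
	/* No WQ_RECLAIM, so no rescuer thread is created for this wq. */
	foo_wq = alloc_workqueue("foo_wq", 0, 0);
	if (!foo_wq)
		return -ENOMEM;
	return 0;
}

static void __exit foo_exit(void)
{
	/* Drains whatever is still queued on foo_wq, then frees it. */
	destroy_workqueue(foo_wq);
}

module_init(foo_init);
module_exit(foo_exit);
MODULE_LICENSE("GPL");
```

Compared with queueing on system_wq and calling flush_workqueue() at unload,
the additions are the foo_wq pointer, the alloc_workqueue() call with its
error check, and destroy_workqueue() in place of the flush. The work items
themselves still need no tracking.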
> > > That flush typically happens in module unload path, and I
> > > wonder if the restriction on flush_workqueue() could be relaxed to allow
> > > calling it on unload.
> >
> > A patch for drivers/input/mouse/psmouse-smbus.c is waiting for your response at
> > https://lkml.kernel.org/r/25e2b787-cb2c-fb0d-d62c-6577ad1cd9df@I-love.SAKURA.ne.jp .
> > Like many modules, flush_workqueue() happens on only module unload in your case.
>
> Yes, I saw that patch, and that is what prompted my response. I find it
> adding complexity and I was wondering if it could be avoided. It also
> unclear to me if there is an additional cost coming from allocating a
> dedicated workqueue.
A workqueue without WQ_RECLAIM is really cheap. All it does is track
what's in flight for that particular frontend while interfacing with the
shared worker pool.
> I understand that for some of them the change makes sense, but it would
> be nice to continue using simple API under limited circumstances.
Hmmm... unfortunately, I can't think of a way to guarantee that a module
unloading path can't get involved in a deadlock scenario through system_wq.
Given that the added complexity should be something like half a dozen lines
of code, switching to separate workqueues feels like the right direction to
me.
Thanks.
--
tejun