Message-ID: <YjitUuIKTWbpX1/G@slm.duckdns.org>
Date: Mon, 21 Mar 2022 06:52:34 -1000
From: Tejun Heo <tj@...nel.org>
To: Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
Cc: Lai Jiangshan <jiangshanlai@...il.com>,
LKML <linux-kernel@...r.kernel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [PATCH v3 (repost)] workqueue: Warn flushing of kernel-global
workqueues
Hello, Tetsuo.
On Sat, Mar 19, 2022 at 01:42:59PM +0900, Tetsuo Handa wrote:
> Since a flush operation synchronously waits for completion, flushing
> kernel-global WQs (e.g. system_wq) might introduce the possibility of
> deadlock due to unexpected locking dependencies. Tejun Heo commented that
> it makes no sense at all to call flush_workqueue() on the shared WQs as
> the caller has no idea what it's gonna end up waiting for.
>
> Although flush_scheduled_work(), which flushes the system_wq WQ, carries
> the warning "Think twice before calling this function! It's very easy to
> get into trouble if you don't take great care.", syzbot found a circular
> locking dependency caused by flushing the system_long_wq WQ [1].
> Therefore, let's change direction: developers should use their own local
> WQs if flush_workqueue() is inevitable.
>
> To give developers time to update their modules, for now just emit a
> ratelimited warning message when flush_workqueue() is called on
> kernel-global WQs. We will eventually convert this warning message into
> a WARN_ON() and kill flush_scheduled_work().
>
> This patch introduces the __WQ_NO_FLUSH flag for emitting the warning.
> Don't set this flag when creating your local WQs while updating your
> module, for destroy_workqueue() will involve a flush operation.
>
> Theoretically, flushing a specific work item queued on kernel-global WQs
> (which are !WQ_MEM_RECLAIM) using flush_work() also has the possibility
> of deadlock. But this patch does not emit a warning when flush_work() is
> called on work items queued on kernel-global WQs, based on the assumption
> that we can create kworker threads as needed and we won't hit the
> max_active limit.
As noted in the other thread, we can float this sort of patch in -next to
suss out the cases which aren't immediately obvious - e.g. something saves
a pointer to one of the system workqueues and flushes that. But if this is
to happen, somebody has to do most of the legwork to convert most, if not
all, of the existing users. That isn't necessarily difficult, but it is
tedious, which is why it hasn't been done yet. So, if you wanna take this
on, you gotta take on the conversions too. Just declaring it so doesn't
really work.
Thanks.
--
tejun