[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Yfhbcr7Ow/HGmHIo@google.com>
Date: Mon, 31 Jan 2022 13:58:10 -0800
From: Igor Pylypiv <ipylypiv@...gle.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Luis Chamberlain <mcgrof@...nel.org>, Tejun Heo <tj@...nel.org>,
linux-modules@...r.kernel.org,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Changyuan Lyu <changyuanl@...gle.com>
Subject: Re: [PATCH] Revert "module, async: async_synchronize_full() on
module init iff async is used"
On Fri, Jan 28, 2022 at 09:39:12AM +0200, Linus Torvalds wrote:
> On Fri, Jan 28, 2022 at 1:40 AM Igor Pylypiv <ipylypiv@...gle.com> wrote:
> >
> > This reverts commit 774a1221e862b343388347bac9b318767336b20b.
>
> Whee. That's from early 2013, more than 9 years ago.
>
> > This works when modprobe thread is calling async_schedule(), but it
> > does not work if module dispatches init code to a worker thread which
> > then calls async_schedule().
>
> Hmm.
>
> You make a good argument:
>
> > Commit 21c3c5d28007 ("block: don't request module during elevator init")
> > fixed the deadlock issue which the reverted commit 774a1221e862 ("module,
> > async: async_synchronize_full() on module init iff async is used") tried
> > to fix.
> >
> > Since commit 0fdff3ec6d87 ("async, kmod: warn on synchronous
> > request_module() from async workers") synchronous module loading
> > from async is not allowed.
>
> This does seem to imply that limiting it to PF_USED_ASYNC is largely
> pointless, at least for the originally stated reason of deadlocks with
> other module loading.
>
> However, we've done this for *so* long that I wonder if there might be
> situations that have ended up depending on the lack of synchronization
> for pure performance reasons.
>
> If *this* module loading process started the async work, then we'd
> wait for it, but what if there's other async work that was started by
> others? This revert would now make us wait for that async work too,
> and that might be a big deal slowing things down at boot time.
Thanks Linus! It is not like we have no wait at all today, we have it
for drivers that call async directly (not through a worker). A potential
future refactor of some driver to start using async may also "suddenly"
enable synchronization through PF_USED_ASYNC.
In case someone encounters a noticeable boot time slowdown they can pass
the 'async_probe' parameter to probe a waiting module asynchronously.
>
> Looking at it, this is all under the 'module_mutex', so I guess we are
> already single-threaded at least wrt loading other modules, so the
> amount of unrelated async work going on is presumably fairly low and
> that isn't an issue.
>
> Anyway, I think this patch is the right thing to do, but just the fact
> that we've avoided that async wait for so long makes me a bit nervous
> about fallout from the revert.
>
> Comments? Maybe this is a "just apply it, see if somebody screams" situation?
>
> Linus
Powered by blists - more mailing lists