lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <Y4dXsNKve02fGGEl@alley>
Date:   Wed, 30 Nov 2022 14:16:32 +0100
From:   Petr Mladek <pmladek@...e.com>
To:     David Laight <David.Laight@...lab.com>
Cc:     'Petr Pavlu' <petr.pavlu@...e.com>,
        "mcgrof@...nel.org" <mcgrof@...nel.org>,
        "prarit@...hat.com" <prarit@...hat.com>,
        "david@...hat.com" <david@...hat.com>,
        "mwilck@...e.com" <mwilck@...e.com>,
        "linux-modules@...r.kernel.org" <linux-modules@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "stable@...r.kernel.org" <stable@...r.kernel.org>
Subject: Re: [PATCH] module: Don't wait for GOING modules

On Sun 2022-11-27 11:21:45, David Laight wrote:
> From: Petr Pavlu
> > Sent: 26 November 2022 14:43
> > 
> > On 11/23/22 16:29, Petr Mladek wrote:
> > > On Wed 2022-11-23 14:12:26, Petr Pavlu wrote:
> > >> During a system boot, it can happen that the kernel receives a burst of
> > >> requests to insert the same module but loading it eventually fails
> > >> during its init call. For instance, udev can make a request to insert
> > >> a frequency module for each individual CPU when another frequency module
> > >> is already loaded which causes the init function of the new module to
> > >> return an error.
> > >>
> > >> Since commit 6e6de3dee51a ("kernel/module.c: Only return -EEXIST for
> > >> modules that have finished loading"), the kernel waits for modules in
> > >> MODULE_STATE_GOING state to finish unloading before making another
> > >> attempt to load the same module.
> > >>
> > >> This creates unnecessary work in the described scenario and delays the
> > >> boot. In the worst case, it can prevent udev from loading drivers for
> > >> other devices and might cause timeouts of services waiting on them and
> > >> subsequently a failed boot.
> > >>
> > >> This patch attempts a different solution for the problem 6e6de3dee51a
> > >> was trying to solve. Rather than waiting for the unloading to complete,
> > >> it returns a different error code (-EBUSY) for modules in the GOING
> > >> state. This should avoid the error situation that was described in
> > >> 6e6de3dee51a (user space attempting to load a dependent module because
> > >> the -EEXIST error code would suggest to user space that the first module
> > >> had been loaded successfully), while avoiding the delay situation too.
> > >>
> 
> While people have all this code cached in their brains
> there is related problem I can easily hit.
> 
> If two processes create sctp sockets at the same time and sctp
> module has to be loaded then the second process can enter the
> module code before is it fully initialised.
> This might be because the try_module_get() succeeds before the
> module initialisation function returns.

Right, the race is there. And it is true that nobody should use
the module until mod->init() succeeds.

Well, I am not sure if there is an easy solution. It might require
reviewing what all try_module_get() callers expect.

We could not easily wait. For example, __sock_create() calls
try_module_get() under rcu_read_lock().

And various callers might want special handing when the module
is coming, going, and when it is not there at all.

I guess that it would require adding some new API and update
the various callers.

> I've avoided the issue by ensuring the socket creates are serialised.

I see. It would be great to have a clean solution, definitely.

Sigh, there are more issues with the module life time. For example,
kobjects might call the release() callback asynchronously and
it might happen when the module/code has gone, see
https://lore.kernel.org/all/20211105063710.4092936-1-ming.lei@redhat.com/

Best Regards,
PEtr

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ