[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1898857d-d580-4fa5-8f19-6e91114975a1@lunn.ch>
Date: Fri, 3 Jan 2025 16:37:33 +0100
From: Andrew Lunn <andrew@...n.ch>
To: Niklas Cassel <cassel@...nel.org>
Cc: "Russell King (Oracle)" <linux@...linux.org.uk>,
Francesco Valla <francesco@...la.it>,
Andrew Lunn <andrew+netdev@...n.ch>,
Heiner Kallweit <hkallweit1@...il.com>, netdev@...r.kernel.org,
Anand Moon <linux.amoon@...il.com>,
Luis Chamberlain <mcgrof@...nel.org>
Subject: Re: [PATCH] net: phy: don't issue a module request if a driver is
available
On Fri, Jan 03, 2025 at 03:34:26PM +0100, Niklas Cassel wrote:
> On Fri, Jan 03, 2025 at 03:12:14PM +0100, Andrew Lunn wrote:
> > > FWIW, the patch in $subject does make the splat go away for me.
> > > (I have the PHY driver built as built-in).
> > >
> > > The patch in $subject does "Add a list of registered drivers and check
> > > if one is already available before resorting to call request_module();
> > > in this way, if the PHY driver is already there, the MDIO bus can perform
> > > the async probe."
> >
> > Lets take a step backwards.
> >
> > How in general should module loading work with async probe? Lets
> > understand that first.
> >
> > Then we can think about what is special about PHYs, and how we can fit
> > into the existing framework.
>
> I agree that it might be a good idea, if possible, for request_module()
> itself to see if the requested module is already registered, and then do
> nothing.
>
> Adding Luis (modules maintainer) to CC.
>
> Luis, phylib calls request_module() unconditionally, regardless if there
> is a driver registered already (phy driver built as built-in).
>
> This causes the splat described here to be printed:
> https://lore.kernel.org/netdev/7103704.9J7NaK4W3v@fedora.fritz.box/T/#u
>
> But as far as I can tell, this is a false positive, since the call cannot
> possibly load any module (since the driver is built as built-in and not
> as a module).
Please be careful here. Just because it is built in for your build
does not mean it is built in for everybody. The code needs to be able
to load the module if it is not built in.
Also, i've seen broken builds where the driver is both built in and
exists as a module. User space will try to load the module, and the
linker will then complain about duplicate symbols and the load fails.
So it is not a false positive. There is also the question of what
exactly causes the deadlock. Is simply calling user space to load a
module which does not exist sufficient to cause the deadlock? That is
what is happening in your system, with built in modules.
Also, the kernel probably has no idea if the module has already been
loaded nor not when request_module() is called. What we pass to the
request_module() is not the name of the module, but a string which
represents one of the IDs the PHY has. It is then userspace that maps
that string to a kernel module via modules.alias. And there is a many
to one mapping, many IDs map to one module. So it could be the module
has already been loaded due to some other string.
So, back to my original question. How should module loading work with
async probe?
Andrew
Powered by blists - more mailing lists