netdev - Re: Race in PHY subsystem? Attaching to PHY devices before they get probed

Message-ID: <65b12597.050a0220.66e91.7b3b@mx.google.com>
Date: Wed, 24 Jan 2024 15:58:28 +0100
From: Christian Marangi <ansuelsmth@...il.com>
To: "Russell King (Oracle)" <linux@...linux.org.uk>
Cc: Andrew Lunn <andrew@...n.ch>,
	Rafał Miłecki <zajec5@...il.com>,
	Network Development <netdev@...r.kernel.org>,
	Heiner Kallweit <hkallweit1@...il.com>,
	Robert Marko <robimarko@...il.com>,
	Daniel Golle <daniel@...rotopia.org>
Subject: Re: Race in PHY subsystem? Attaching to PHY devices before they get
 probed

On Mon, Jan 22, 2024 at 04:56:18PM +0000, Russell King (Oracle) wrote:
> On Mon, Jan 22, 2024 at 03:12:42PM +0100, Andrew Lunn wrote:
> > On Mon, Jan 22, 2024 at 08:09:58AM +0100, Rafał Miłecki wrote:
> > > Hi!
> > > 
> > > I have MT7988 SoC board with following problem:
> > > [   26.887979] Aquantia AQR113C mdio-bus:08: aqr107_wait_reset_complete failed: -110
> > > 
> > > This issue is known to occur when PHY's firmware is not running. After
> > > some debugging I discovered that .config_init() CB gets called while
> > > .probe() CB is still being executed.
> > > 
> > > It turns out mtk_soc_eth.c calls phylink_of_phy_connect() before my PHY
> > > gets fully probed and it seems there is nothing in PHY subsystem
> > > verifying that. Please note this PHY takes quite some time to probe as
> > > it involves sending firmware to hardware.
> > > 
> > > Is that a possible race in PHY subsystem?
> > 
> > Seems like it.
> > 
> > There is a patch "net: phylib: get rid of unnecessary locking" which
> > removed locks from probe, which might have helped, but the patch also
> > says:
> > 
> >     The locking in phy_probe() and phy_remove() does very little to prevent
> >     any races with e.g. phy_attach_direct(),
> > 
> > suggesting it probably did not help.
> 
> The reason for that statement is because phy_attach_direct() doesn't
> take phydev->lock _at all_, so taking the lock in phy_probe() has no
> effect on any race with phy_attach_direct().
> 
> > I think the traditional way problems like this are avoided is that the
> > device should not be visible to the rest of the system until probe has
> > completed.
> 
> However, we have the problem of the generic driver fallback - which
> phy_attach_direct() does.
> 
> The probe vs phy_attach_direct() has been racy for quite a long time,
> and the poking about that's done in that function such as assigning
> struct device's driver member, calling device_bind_driver() etc is
> all hellishly racy if the phy_device _could_ be bound simultaneously.
> 
> Also note this... we call device_bind_driver() from phy_attach_direct(),
> and the documentation for this function states:
> 
>  * This function must be called with the device lock held.
> 
> which we don't do. So we're already violating the locking requirements
> for the driver model.
> 
> So, I would suggest that the solution is to make use of device_lock()
> which will also only return once a probe has succeeded.
> 
> However, that's still not ideal - because the fact we have a race here
> means that what could happen is that phy_attach_direct() is called
> a little earlier than the probe begins, and the phy device ends up
> being bound to the generic PHY driver rather than its specific driver.
> 
> I think what this comes down to are the following points:
> 
> 1) not using the required device model locking
> 2) the mere existence of the default driver makes for a race between
>    the PHY being attached and its driver being probed.
> 
> No amount of locking saves us from (2) - the only solutions that I can
> see to this are:
> 1) to put up with there being such a race
> 2) get rid of the default drivers altogether and insist that we have
>    specific PHY drivers for _all_ PHYs
> 3) have some kind of retry mechanism
> 
> A further problem is... we can't simply return -EPROBE_DEFER from
> phy_attach_direct() because this function may not be called from
> probe functions - it may be called from the .ndo_open method which
> has no idea how to handle a probe deferral. Moreover, returning an
> error to userspace will just cause it to fail (because all errors
> from trying to bring a netdev up are considered to be fatal.)
> 
> So, it's a really yucky problem, and I don't see any nice and simple
> solution.
>
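
To make option 3 (a retry mechanism) a bit more concrete, here is a rough
userspace sketch. It is only a model under my own assumptions: the
model_* names, the -EAGAIN return and the tick counter standing in for a
slow firmware upload are all hypothetical and are not phylib APIs.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum model_state { MODEL_UNBOUND, MODEL_PROBING, MODEL_BOUND };

struct model_phy {
	enum model_state state;
	int probe_ticks_left;	/* stands in for a slow firmware upload */
	bool attached;
};

/* Advance the simulated probe by one tick. */
static void model_probe_tick(struct model_phy *phy)
{
	if (phy->state != MODEL_PROBING)
		return;
	if (--phy->probe_ticks_left <= 0)
		phy->state = MODEL_BOUND;
}

/* Refuse to attach while the specific driver is still probing. */
static int model_attach(struct model_phy *phy)
{
	if (phy->state == MODEL_PROBING)
		return -EAGAIN;
	phy->attached = true;
	return 0;
}

/* Retry up to max_tries times, letting the probe progress in between. */
static int model_attach_retry(struct model_phy *phy, int max_tries)
{
	int ret = -EAGAIN;

	assert(max_tries > 0);
	for (int i = 0; i < max_tries && ret == -EAGAIN; i++) {
		ret = model_attach(phy);
		if (ret == -EAGAIN)
			model_probe_tick(phy);
	}
	return ret;
}
```

Note that in this model an attach in the MODEL_UNBOUND state succeeds
right away, which corresponds to ending up on the generic driver. So a
retry only helps once the probe has actually started, and the caller
still needs a sensible bound on the number of tries.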

Well, if more and more PHYs start requiring a FW load, then this will
become a big problem...

I spent a good amount of time on this. If the MDIO bus is slow enough,
loading the FW can take as long as 100s, so probe is still running when
config_init gets called.

Since the FW has not been loaded yet, config_init reads bad data and
fails to configure the PHY (probe only completes some time later).

I don't know if it would be acceptable as a solution, but moving the
fw_load call into config_init seems to "handle" this problem. IMHO it's
still a hack around a fragile implementation.
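
As a rough illustration of that idea, here is a small userspace model.
It is not the aquantia driver code: the model_* names and the struct are
hypothetical, and the -ETIMEDOUT return stands in for the -110 seen in
the aqr107_wait_reset_complete log above.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model; these are not the aquantia driver's functions. */
struct model_aqr {
	bool fw_running;
	int fw_loads;	/* how many times firmware was uploaded */
};

/* Pretend to upload the firmware; always succeeds in this model. */
static int model_fw_load(struct model_aqr *phy)
{
	phy->fw_running = true;
	phy->fw_loads++;
	return 0;
}

/* Stand-in for waiting on reset completion: times out without firmware. */
static int model_wait_reset_complete(const struct model_aqr *phy)
{
	return phy->fw_running ? 0 : -ETIMEDOUT;
}

/* config_init loads firmware on demand instead of relying on probe. */
static int model_config_init(struct model_aqr *phy)
{
	int ret;

	if (!phy->fw_running) {
		ret = model_fw_load(phy);
		if (ret)
			return ret;
	}
	return model_wait_reset_complete(phy);
}
```

With this, config_init no longer depends on probe having finished, and
a second call does not upload the firmware again. It does nothing about
the underlying attach vs probe ordering though.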

-- 
	Ansuel