Message-ID: <Za6eMg0y2QxogfmD@shell.armlinux.org.uk>
Date: Mon, 22 Jan 2024 16:56:18 +0000
From: "Russell King (Oracle)" <linux@...linux.org.uk>
To: Andrew Lunn <andrew@...n.ch>
Cc: Rafał Miłecki <zajec5@...il.com>,
	Network Development <netdev@...r.kernel.org>,
	Heiner Kallweit <hkallweit1@...il.com>,
	Robert Marko <robimarko@...il.com>,
	Ansuel Smith <ansuelsmth@...il.com>,
	Daniel Golle <daniel@...rotopia.org>
Subject: Re: Race in PHY subsystem? Attaching to PHY devices before they get
 probed

On Mon, Jan 22, 2024 at 03:12:42PM +0100, Andrew Lunn wrote:
> On Mon, Jan 22, 2024 at 08:09:58AM +0100, Rafał Miłecki wrote:
> > Hi!
> > 
> > I have MT7988 SoC board with following problem:
> > [   26.887979] Aquantia AQR113C mdio-bus:08: aqr107_wait_reset_complete failed: -110
> > 
> > This issue is known to occur when PHY's firmware is not running. After
> > some debugging I discovered that .config_init() CB gets called while
> > .probe() CB is still being executed.
> > 
> > It turns out mtk_soc_eth.c calls phylink_of_phy_connect() before my PHY
> > gets fully probed and it seems there is nothing in PHY subsystem
> > verifying that. Please note this PHY takes quite some time to probe as
> > it involves sending firmware to hardware.
> > 
> > Is that a possible race in PHY subsystem?
> 
> Seems like it.
> 
> There is a patch "net: phylib: get rid of unnecessary locking" which
> removed locks from probe, which might have helped, but the patch also
> says:
> 
>     The locking in phy_probe() and phy_remove() does very little to prevent
>     any races with e.g. phy_attach_direct(),
> 
> suggesting it probably did not help.

The reason for that statement is that phy_attach_direct() doesn't
take phydev->lock _at all_, so taking the lock in phy_probe() has no
effect on any race with phy_attach_direct().

> I think the traditional way problems like this are avoided is that the
> device should not be visible to the rest of the system until probe has
> completed.

However, we have the problem of the generic driver fallback - which
phy_attach_direct() does.

The probe vs phy_attach_direct() has been racy for quite a long time,
and the poking about that's done in that function such as assigning
struct device's driver member, calling device_bind_driver() etc is
all hellishly racy if the phy_device _could_ be bound simultaneously.

Also note this... we call device_bind_driver() from phy_attach_direct(),
and the documentation for this function states:

 * This function must be called with the device lock held.

which we don't do. So we're already violating the locking requirements
for the driver model.

So, I would suggest that the solution is to make use of device_lock(),
which will also only return once any in-progress probe has finished.

However, that's still not ideal - because the fact we have a race here
means that what could happen is that phy_attach_direct() is called
a little earlier than the probe begins, and the phy device ends up
being bound to the generic PHY driver rather than its specific driver.
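
To illustrate why the lock serialises the two paths without deciding
which one wins, here is a small userspace model. It is only a sketch:
struct model_dev, model_probe(), model_attach() and model_demo() are
made-up names, and a pthread mutex stands in for the device lock. It is
not kernel code and not what phylib does today.

```c
#include <assert.h>
#include <pthread.h>

/* Which driver the modelled device is bound to. */
enum model_driver { MODEL_NONE, MODEL_GENERIC, MODEL_SPECIFIC };

/* Userspace stand-in for a PHY's struct device; not a kernel type. */
struct model_dev {
	pthread_mutex_t lock;		/* plays the role of the device lock */
	enum model_driver driver;	/* MODEL_NONE while unbound */
};

/* Stand-in for the driver core probing the specific driver under the lock. */
static void model_probe(struct model_dev *d)
{
	pthread_mutex_lock(&d->lock);
	if (d->driver == MODEL_NONE)	/* an already bound device is skipped */
		d->driver = MODEL_SPECIFIC;
	pthread_mutex_unlock(&d->lock);
}

/* Stand-in for attach with the generic fallback, also under the lock. */
static enum model_driver model_attach(struct model_dev *d)
{
	enum model_driver bound;

	pthread_mutex_lock(&d->lock);
	if (d->driver == MODEL_NONE)
		d->driver = MODEL_GENERIC;
	bound = d->driver;
	pthread_mutex_unlock(&d->lock);
	return bound;
}

/* Both call orders: the result depends only on who takes the lock first. */
static void model_demo(void)
{
	struct model_dev a = { PTHREAD_MUTEX_INITIALIZER, MODEL_NONE };
	struct model_dev b = { PTHREAD_MUTEX_INITIALIZER, MODEL_NONE };

	model_probe(&a);
	assert(model_attach(&a) == MODEL_SPECIFIC);	/* probe ran first */

	assert(model_attach(&b) == MODEL_GENERIC);	/* attach ran first */
	model_probe(&b);
	assert(b.driver == MODEL_GENERIC);		/* probe cannot displace it */
}
```

In the model, neither path can observe a half-finished bind, yet device
b still ends up on the generic driver purely because attach got there
first.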

I think what this comes down to are the following points:

1) not using the required device model locking
2) the mere existence of the default driver makes for a race between
   the PHY being attached and its driver being probed.

No amount of locking saves us from (2) - the only solutions that I can
see to this are:
1) to put up with there being such a race
2) get rid of the default drivers altogether and insist that we have
   specific PHY drivers for _all_ PHYs
3) have some kind of retry mechanism

A further problem is... we can't simply return -EPROBE_DEFER from
phy_attach_direct() because this function is not always called from
probe functions - it may be called from the .ndo_open method which
has no idea how to handle a probe deferral. Moreover, returning an
error to userspace will just cause it to fail (because all errors
from trying to bring a netdev up are considered to be fatal.)
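
The difference between the two calling contexts can be written down as
a tiny userspace model. Again this is only a sketch under assumptions:
the model_* names are hypothetical, 517 is the kernel-internal value of
EPROBE_DEFER (userspace errno.h does not define it), and -ENODEV is just
a placeholder for "some fatal error" in the open path.

```c
#include <assert.h>
#include <errno.h>

/* Kernel-internal errno value; absent from userspace headers. */
#define MODEL_EPROBE_DEFER 517

enum model_ctx {
	MODEL_CTX_PROBE,	/* caller is a MAC driver's probe function */
	MODEL_CTX_NDO_OPEN,	/* caller is .ndo_open, reached from userspace */
};

/* Hypothetical: can this caller act on a "try again later" result? */
static int model_can_defer(enum model_ctx ctx)
{
	return ctx == MODEL_CTX_PROBE;
}

/* What the caller ends up reporting if the PHY driver is not bound yet. */
static int model_attach_result(int phy_driver_bound, enum model_ctx ctx)
{
	if (phy_driver_bound)
		return 0;
	if (model_can_defer(ctx))
		return -MODEL_EPROBE_DEFER;	/* driver core re-runs the probe later */
	return -ENODEV;				/* placeholder: no retry path, open fails */
}

/* Same unbound PHY, two contexts, two very different outcomes. */
static void model_demo(void)
{
	assert(model_attach_result(1, MODEL_CTX_PROBE) == 0);
	assert(model_attach_result(1, MODEL_CTX_NDO_OPEN) == 0);
	assert(model_attach_result(0, MODEL_CTX_PROBE) == -MODEL_EPROBE_DEFER);
	assert(model_attach_result(0, MODEL_CTX_NDO_OPEN) == -ENODEV);
}
```

Only the probe context has something that will call back later; in the
open context the same condition can only turn into a failure.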

So, it's a really yucky problem, and I don't see any nice and simple
solution.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!