Message-ID:
<PAXPR04MB85107D8AB628CC9814C9B230886CA@PAXPR04MB8510.eurprd04.prod.outlook.com>
Date: Wed, 4 Jun 2025 06:00:54 +0000
From: Wei Fang <wei.fang@....com>
To: "Abhishek Chauhan (ABC)" <quic_abchauha@...cinc.com>
CC: Florian Fainelli <f.fainelli@...il.com>, "andrew@...n.ch"
<andrew@...n.ch>, "hkallweit1@...il.com" <hkallweit1@...il.com>,
"linux@...linux.org.uk" <linux@...linux.org.uk>, "davem@...emloft.net"
<davem@...emloft.net>, "edumazet@...gle.com" <edumazet@...gle.com>,
"kuba@...nel.org" <kuba@...nel.org>, "pabeni@...hat.com" <pabeni@...hat.com>,
"xiaolei.wang@...driver.com" <xiaolei.wang@...driver.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"imx@...ts.linux.dev" <imx@...ts.linux.dev>
Subject: RE: [PATCH v2 net] net: phy: clear phydev->devlink when the link is
deleted
> > On 5/23/2025 1:37 AM, Wei Fang wrote:
> >> There is a potential crash issue when disabling and re-enabling the
> >> network port. When disabling the network port, phy_detach() calls
> >> device_link_del() to remove the device link, but it does not clear
> >> phydev->devlink, so phydev->devlink is not a NULL pointer. Then the
> >> network port is re-enabled, but if phy_attach_direct() fails before
> >> calling device_link_add(), the code jumps to the "error" label and
> >> calls phy_detach(). Since phydev->devlink retains the old value from
> >> the previous attach/detach cycle, device_link_del() uses the old value,
> >> which accesses a NULL pointer and causes a crash. The simplified crash
> >> log is as follows.
> >>
> >> [ 24.702421] Call trace:
> >> [ 24.704856] device_link_put_kref+0x20/0x120
> >> [ 24.709124] device_link_del+0x30/0x48
> >> [ 24.712864] phy_detach+0x24/0x168
> >> [ 24.716261] phy_attach_direct+0x168/0x3a4
> >> [ 24.720352] phylink_fwnode_phy_connect+0xc8/0x14c
> >> [ 24.725140] phylink_of_phy_connect+0x1c/0x34
> >>
> >> Therefore, phydev->devlink needs to be cleared when the device link is
> >> deleted.
> >>
> >> Fixes: bc66fa87d4fd ("net: phy: Add link between phy dev and mac dev")
> >> Signed-off-by: Wei Fang <wei.fang@....com>
> >
> @Wei
> What happens in the case of a shared MDIO bus?
>
> 1. Device 23040000 has the MDIO node of both Ethernet PHYs, and device
> 23000000 references the phy-handle present in device 23040000.
> 2. The driver is removed with rmmod.
> 3. The parent devlink has already been deleted.
> 4. This causes the child MDIO to access an entry, causing corruption.
> 5. I thought this fix would help, but I see that it does not help in this case.
>
My patch only fixes the potential crash when re-enabling the network
interface. phy_detach() is not called when the MDIO controller driver
is removed, so phydev->devlink is not cleared even though the device
link has already been removed by phy_device_remove() --> device_del().
The stale pointer then causes the crash when the MAC controller driver
is removed.
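
To make the two cases concrete, here is a minimal userspace sketch, not kernel code. All mock_* names are hypothetical stand-ins for struct phy_device, device_link_add(), device_link_del(), phy_attach_direct() and phy_detach(); a deletion through a dead link is only counted here, whereas the kernel would crash.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device_link. */
struct mock_link { bool alive; };

/* Hypothetical stand-in for struct phy_device with its cached link pointer. */
struct mock_phy { struct mock_link *devlink; };

static struct mock_link pool[16];	/* static pool, so nothing is ever freed */
static size_t next_slot;
static int stale_dels;			/* deletions attempted on a dead link */

static struct mock_link *mock_link_add(void)
{
	if (next_slot >= sizeof(pool) / sizeof(pool[0]))
		return NULL;
	pool[next_slot].alive = true;
	return &pool[next_slot++];
}

static void mock_link_del(struct mock_link *link)
{
	if (!link->alive) {		/* the kernel would crash at this point */
		stale_dels++;
		return;
	}
	link->alive = false;
}

static void mock_detach(struct mock_phy *phy, bool clear_on_detach)
{
	if (phy->devlink) {
		mock_link_del(phy->devlink);
		if (clear_on_detach)
			phy->devlink = NULL;	/* what the v2 patch adds */
	}
}

static int mock_attach(struct mock_phy *phy, bool fail_early, bool clear_on_detach)
{
	if (fail_early) {		/* error path taken before the link is added */
		mock_detach(phy, clear_on_detach);
		return -1;
	}
	phy->devlink = mock_link_add();
	return phy->devlink ? 0 : -1;
}

/* Case 1: disable the port, then the re-enable fails early. */
int stale_deletes_after_failed_reattach(bool clear_on_detach)
{
	struct mock_phy phy = { .devlink = NULL };

	stale_dels = 0;
	mock_attach(&phy, false, clear_on_detach);	/* port enabled */
	mock_detach(&phy, clear_on_detach);		/* port disabled */
	mock_attach(&phy, true, clear_on_detach);	/* re-enable fails */
	return stale_dels;
}

/* Case 2: the link is torn down behind the PHY's back, then the MAC detaches. */
int stale_deletes_after_supplier_removal(void)
{
	struct mock_phy phy = { .devlink = NULL };

	stale_dels = 0;
	mock_attach(&phy, false, true);
	phy.devlink->alive = false;	/* models device_del() removing the link */
	mock_detach(&phy, true);	/* clearing in detach comes too late */
	return stale_dels;
}
```

In this model, stale_deletes_after_failed_reattach() reports one stale deletion without the clearing and none with it, while stale_deletes_after_supplier_removal() still reports one even with the clearing in place, which matches the shared MDIO case above.
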
> Wondering if this is a legacy issue with shared mdio framework.
>
I think this issue was also introduced by commit bc66fa87d4fd
("net: phy: Add link between phy dev and mac dev"). I suggest
changing the DL_FLAG_STATELESS flag to DL_FLAG_AUTOREMOVE_SUPPLIER
to solve it. The link then becomes a managed one: the driver core
unbinds the consumer (MAC controller) driver before the supplier
driver goes away and removes the link automatically, so phylib no
longer needs to keep a pointer to it. The changes are as follows.

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 73f9cb2e2844..a6d7acd73391 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -1515,6 +1515,7 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
 	struct mii_bus *bus = phydev->mdio.bus;
 	struct device *d = &phydev->mdio.dev;
 	struct module *ndev_owner = NULL;
+	struct device_link *devlink;
 	bool using_genphy = false;
 	int err;
 
@@ -1646,9 +1647,16 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
 	 * another mac interface, so we should create a device link between
 	 * phy dev and mac dev.
 	 */
-	if (dev && phydev->mdio.bus->parent && dev->dev.parent != phydev->mdio.bus->parent)
-		phydev->devlink = device_link_add(dev->dev.parent, &phydev->mdio.dev,
-						  DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);
+	if (dev && phydev->mdio.bus->parent &&
+	    dev->dev.parent != phydev->mdio.bus->parent) {
+		devlink = device_link_add(dev->dev.parent, &phydev->mdio.dev,
+					  DL_FLAG_PM_RUNTIME |
+					  DL_FLAG_AUTOREMOVE_SUPPLIER);
+		if (!devlink) {
+			err = -ENOMEM;
+			goto error;
+		}
+	}
 
 	return err;
 
@@ -1749,11 +1757,6 @@ void phy_detach(struct phy_device *phydev)
 	struct module *ndev_owner = NULL;
 	struct mii_bus *bus;
 
-	if (phydev->devlink) {
-		device_link_del(phydev->devlink);
-		phydev->devlink = NULL;
-	}
-
 	if (phydev->sysfs_links) {
 		if (dev)
 			sysfs_remove_link(&dev->dev.kobj, "phydev");
diff --git a/include/linux/phy.h b/include/linux/phy.h
index e194dad1623d..cc1f45c3ff21 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -505,8 +505,6 @@ struct macsec_ops;
  *
  * @mdio: MDIO bus this PHY is on
  * @drv: Pointer to the driver for this PHY instance
- * @devlink: Create a link between phy dev and mac dev, if the external phy
- *           used by current mac interface is managed by another mac interface.
  * @phyindex: Unique id across the phy's parent tree of phys to address the PHY
  *            from userspace, similar to ifindex. A zero index means the PHY
  *            wasn't assigned an id yet.
@@ -610,8 +608,6 @@ struct phy_device {
 	/* And management functions */
 	const struct phy_driver *drv;
 
-	struct device_link *devlink;
-
 	u32 phyindex;
 	u32 phy_id;
 
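
As a rough illustration of why the diff can drop the cached pointer, here is a small userspace model of what happens to a link when the supplier (PHY) driver unbinds. It is a simplification based on my reading of the device link documentation, not the driver core's implementation; the MOCK_* flags and mock_* types are hypothetical stand-ins for DL_FLAG_STATELESS, DL_FLAG_AUTOREMOVE_SUPPLIER and the real structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for DL_FLAG_STATELESS / DL_FLAG_AUTOREMOVE_SUPPLIER. */
enum { MOCK_STATELESS = 1, MOCK_AUTOREMOVE_SUPPLIER = 2 };

struct mock_dev { bool bound; };

struct mock_link {
	struct mock_dev *consumer;	/* MAC controller */
	struct mock_dev *supplier;	/* PHY device */
	unsigned int flags;
	bool present;
};

struct mock_result { bool consumer_bound; bool link_present; };

/* Simplified model of the supplier driver unbinding. */
static void mock_supplier_unbind(struct mock_link *link)
{
	/* Assumption: a link without the stateless flag is a managed link. */
	bool managed = !(link->flags & MOCK_STATELESS);

	/* For a managed link, the consumer is unbound before the supplier. */
	if (link->present && managed)
		link->consumer->bound = false;

	link->supplier->bound = false;

	/* With autoremove-supplier, the link is dropped on supplier unbind. */
	if (link->present && managed && (link->flags & MOCK_AUTOREMOVE_SUPPLIER))
		link->present = false;
}

/* Unbind the PHY driver and report what is left of the consumer and link. */
struct mock_result mock_unbind_phy(unsigned int flags)
{
	struct mock_dev mac = { .bound = true };
	struct mock_dev phy = { .bound = true };
	struct mock_link link = { &mac, &phy, flags, true };

	mock_supplier_unbind(&link);
	return (struct mock_result){ mac.bound, link.present };
}
```

In this model, mock_unbind_phy(MOCK_AUTOREMOVE_SUPPLIER) leaves neither a bound consumer nor a link behind, while mock_unbind_phy(MOCK_STATELESS) leaves both in place, so any caller that cached the link would have to manage its lifetime itself.
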