Message-ID: <0b44c0f5-d922-4d89-8244-f114aedafa03@quicinc.com>
Date: Tue, 3 Jun 2025 23:09:49 -0700
From: "Abhishek Chauhan (ABC)" <quic_abchauha@...cinc.com>
To: Wei Fang <wei.fang@....com>
CC: Florian Fainelli <f.fainelli@...il.com>,
"andrew@...n.ch" <andrew@...n.ch>,
"hkallweit1@...il.com" <hkallweit1@...il.com>,
"linux@...linux.org.uk" <linux@...linux.org.uk>,
"davem@...emloft.net" <davem@...emloft.net>,
"edumazet@...gle.com" <edumazet@...gle.com>,
"kuba@...nel.org" <kuba@...nel.org>,
"pabeni@...hat.com" <pabeni@...hat.com>,
"xiaolei.wang@...driver.com" <xiaolei.wang@...driver.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"imx@...ts.linux.dev" <imx@...ts.linux.dev>,
Sarosh Hasan <quic_sarohasa@...cinc.com>
Subject: Re: [PATCH v2 net] net: phy: clear phydev->devlink when the link is deleted
On 6/3/2025 11:00 PM, Wei Fang wrote:
>>> On 5/23/2025 1:37 AM, Wei Fang wrote:
>>>> There is a potential crash issue when disabling and re-enabling the
>>>> network port. When disabling the network port, phy_detach() calls
>>>> device_link_del() to remove the device link, but it does not clear
>>>> phydev->devlink, so phydev->devlink is not a NULL pointer. Then the
>>>> network port is re-enabled, but if phy_attach_direct() fails before
>>>> calling device_link_add(), the code jumps to the "error" label and
>>>> calls phy_detach(). Since phydev->devlink retains the old value from
>>>> the previous attach/detach cycle, device_link_del() uses the old value,
>>>> which accesses a NULL pointer and causes a crash. The simplified crash
>>>> log is as follows.
>>>>
>>>> [ 24.702421] Call trace:
>>>> [ 24.704856] device_link_put_kref+0x20/0x120
>>>> [ 24.709124] device_link_del+0x30/0x48
>>>> [ 24.712864] phy_detach+0x24/0x168
>>>> [ 24.716261] phy_attach_direct+0x168/0x3a4
>>>> [ 24.720352] phylink_fwnode_phy_connect+0xc8/0x14c
>>>> [ 24.725140] phylink_of_phy_connect+0x1c/0x34
>>>>
>>>> Therefore, phydev->devlink needs to be cleared when the device link is
>>>> deleted.
>>>>
>>>> Fixes: bc66fa87d4fd ("net: phy: Add link between phy dev and mac dev")
>>>> Signed-off-by: Wei Fang <wei.fang@....com>
>>>
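For reference, the attach/detach sequence from the commit message can be
modelled in userspace. This is only a sketch under stated assumptions:
mock_link, mock_phy and the mock_* helpers are hypothetical stand-ins,
not the kernel's struct phy_device or the device link API, and a "live"
flag stands in for the freed link so the stale access is counted instead
of crashing.

```c
/* Userspace mock only: hypothetical types, not kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mock_link {
	bool live;	/* true between add and del */
	int bad_dels;	/* deletes that hit an already-deleted link */
};

struct mock_phy {
	struct mock_link *devlink;	/* cached pointer, as in phydev->devlink */
};

static void mock_link_add(struct mock_phy *phy, struct mock_link *link)
{
	assert(link != NULL);
	link->live = true;
	phy->devlink = link;
}

/* Models the detach path; clear_ptr selects the behaviour with the fix. */
static void mock_detach(struct mock_phy *phy, bool clear_ptr)
{
	if (!phy->devlink)
		return;

	if (phy->devlink->live)
		phy->devlink->live = false;
	else
		phy->devlink->bad_dels++;	/* stale pointer reused */

	if (clear_ptr)
		phy->devlink = NULL;
}

/*
 * Attach, detach, then a re-attach that fails before any new link is
 * created and therefore runs the detach path again.
 * Returns the number of deletes that hit a stale link.
 */
static int mock_reattach_failure(bool clear_ptr)
{
	struct mock_link link = { .live = false, .bad_dels = 0 };
	struct mock_phy phy = { .devlink = NULL };

	mock_link_add(&phy, &link);	/* first attach succeeds */
	mock_detach(&phy, clear_ptr);	/* port disabled */
	mock_detach(&phy, clear_ptr);	/* failed re-attach, error path */

	return link.bad_dels;
}
```

In this model, mock_reattach_failure(false) reaches the stale link once,
while mock_reattach_failure(true) skips the second delete because the
cached pointer was cleared.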
>> @Wei
>> What happens in the case of shared MDIO?
>>
>> 1. Device 23040000 holds the MDIO node for both Ethernet PHYs, and
>> device 23000000 references the phy-handle present in device 23040000.
>> 2. The driver is removed with rmmod.
>> 3. The parent devlink has already been deleted.
>> 4. This causes the child MDIO to access an entry, which leads to corruption.
>> 5. I thought this fix would help, but I see that it does not help in this case.
>>
>
> My patch is only to fix the potential crash issue when re-enabling
> the network interface. phy_detach() is not called when the MDIO
> controller driver is removed. So phydev->devlink is not cleared, but
> actually the device link has been removed by phy_device_remove()
> --> device_del(). Therefore, it will cause the crash when the MAC
> controller driver is removed.
>
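The removal order described above can also be modelled in userspace.
Again this is a sketch, not kernel code: mock_core and the mock_core_*
helpers are hypothetical, and the model assumes (following the
explanation above) that removing the PHY device drops every link, no
matter who still caches a handle to it.

```c
/* Userspace mock only: hypothetical registry, not the driver core. */
#include <assert.h>
#include <stdbool.h>

#define MOCK_MAX_LINKS 4

struct mock_core {
	bool live[MOCK_MAX_LINKS];
};

/* Returns a slot index used as the link handle, or -1 if full. */
static int mock_core_link_add(struct mock_core *core)
{
	for (int i = 0; i < MOCK_MAX_LINKS; i++) {
		if (!core->live[i]) {
			core->live[i] = true;
			return i;
		}
	}
	return -1;
}

/* Models removal of the PHY device: all of its links go away. */
static void mock_core_purge(struct mock_core *core)
{
	for (int i = 0; i < MOCK_MAX_LINKS; i++)
		core->live[i] = false;
}

/* Returns true if a live link was deleted, false if the handle was stale. */
static bool mock_core_link_del(struct mock_core *core, int handle)
{
	assert(handle >= 0 && handle < MOCK_MAX_LINKS);
	if (!core->live[handle])
		return false;
	core->live[handle] = false;
	return true;
}

/*
 * cache_handle == true:  the MAC side keeps the handle and deletes the
 *                        link itself when it is removed.
 * cache_handle == false: the MAC side keeps nothing and leaves cleanup
 *                        to the core.
 * Returns the number of deletes that used a stale handle.
 */
static int mock_shared_mdio_remove(bool cache_handle)
{
	struct mock_core core = { { false } };
	int handle = mock_core_link_add(&core);
	int stale = 0;

	mock_core_purge(&core);		/* MDIO controller driver removed */
	if (cache_handle && !mock_core_link_del(&core, handle))
		stale++;		/* MAC controller driver removed later */

	return stale;
}
```

With the handle cached, the later delete finds a link that is already
gone; without a cached handle there is nothing left to misuse.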
>> Wondering if this is a legacy issue with shared mdio framework.
>>
>
> I think this issue was also introduced by commit bc66fa87d4fd
> ("net: phy: Add link between phy dev and mac dev"). I suggested
> changing the DL_FLAG_STATELESS flag to
> DL_FLAG_AUTOREMOVE_SUPPLIER to solve this issue, so that
> the consumer (MAC controller) driver will be automatically removed
> when the link is removed. The changes are as follows.
>
Thanks a lot, Russell and Wei, for your prompt response.
I appreciate your help. Let me test this change and get back to you.
> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> index 73f9cb2e2844..a6d7acd73391 100644
> --- a/drivers/net/phy/phy_device.c
> +++ b/drivers/net/phy/phy_device.c
> @@ -1515,6 +1515,7 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
>  	struct mii_bus *bus = phydev->mdio.bus;
>  	struct device *d = &phydev->mdio.dev;
>  	struct module *ndev_owner = NULL;
> +	struct device_link *devlink;
>  	bool using_genphy = false;
>  	int err;
>
> @@ -1646,9 +1647,16 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
>  	 * another mac interface, so we should create a device link between
>  	 * phy dev and mac dev.
>  	 */
> -	if (dev && phydev->mdio.bus->parent && dev->dev.parent != phydev->mdio.bus->parent)
> -		phydev->devlink = device_link_add(dev->dev.parent, &phydev->mdio.dev,
> -						  DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);
> +	if (dev && phydev->mdio.bus->parent &&
> +	    dev->dev.parent != phydev->mdio.bus->parent) {
> +		devlink = device_link_add(dev->dev.parent, &phydev->mdio.dev,
> +					  DL_FLAG_PM_RUNTIME |
> +					  DL_FLAG_AUTOREMOVE_SUPPLIER);
> +		if (!devlink) {
> +			err = -ENOMEM;
> +			goto error;
> +		}
> +	}
>
>  	return err;
>
> @@ -1749,11 +1757,6 @@ void phy_detach(struct phy_device *phydev)
>  	struct module *ndev_owner = NULL;
>  	struct mii_bus *bus;
>
> -	if (phydev->devlink) {
> -		device_link_del(phydev->devlink);
> -		phydev->devlink = NULL;
> -	}
> -
>  	if (phydev->sysfs_links) {
>  		if (dev)
>  			sysfs_remove_link(&dev->dev.kobj, "phydev");
> diff --git a/include/linux/phy.h b/include/linux/phy.h
> index e194dad1623d..cc1f45c3ff21 100644
> --- a/include/linux/phy.h
> +++ b/include/linux/phy.h
> @@ -505,8 +505,6 @@ struct macsec_ops;
> *
> * @mdio: MDIO bus this PHY is on
> * @drv: Pointer to the driver for this PHY instance
> - * @devlink: Create a link between phy dev and mac dev, if the external phy
> - * used by current mac interface is managed by another mac interface.
> * @phyindex: Unique id across the phy's parent tree of phys to address the PHY
> * from userspace, similar to ifindex. A zero index means the PHY
> * wasn't assigned an id yet.
> @@ -610,8 +608,6 @@ struct phy_device {
>  	/* And management functions */
>  	const struct phy_driver *drv;
>
> -	struct device_link *devlink;
> -
>  	u32 phyindex;
>  	u32 phy_id;
>