Message-ID: <d696a426-40bb-4c1a-b42d-990fb690de5e@quicinc.com>
Date: Tue, 3 Jun 2025 13:39:47 -0700
From: "Abhishek Chauhan (ABC)" <quic_abchauha@...cinc.com>
To: Florian Fainelli <f.fainelli@...il.com>, Wei Fang <wei.fang@....com>,
<andrew@...n.ch>, <hkallweit1@...il.com>, <linux@...linux.org.uk>,
<davem@...emloft.net>, <edumazet@...gle.com>, <kuba@...nel.org>,
<pabeni@...hat.com>, <xiaolei.wang@...driver.com>
CC: <netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<imx@...ts.linux.dev>
Subject: Re: [PATCH v2 net] net: phy: clear phydev->devlink when the link is
deleted
On 5/23/2025 8:19 AM, Florian Fainelli wrote:
>
>
> On 5/23/2025 1:37 AM, Wei Fang wrote:
>> There is a potential crash issue when disabling and re-enabling the
>> network port. When disabling the network port, phy_detach() calls
>> device_link_del() to remove the device link, but it does not clear
>> phydev->devlink, so phydev->devlink is not a NULL pointer. Then the
>> network port is re-enabled, but if phy_attach_direct() fails before
>> calling device_link_add(), the code jumps to the "error" label and
>> calls phy_detach(). Since phydev->devlink retains the old value from
>> the previous attach/detach cycle, device_link_del() uses the old value,
>> which accesses a NULL pointer and causes a crash. The simplified crash
>> log is as follows.
>>
>> [ 24.702421] Call trace:
>> [ 24.704856] device_link_put_kref+0x20/0x120
>> [ 24.709124] device_link_del+0x30/0x48
>> [ 24.712864] phy_detach+0x24/0x168
>> [ 24.716261] phy_attach_direct+0x168/0x3a4
>> [ 24.720352] phylink_fwnode_phy_connect+0xc8/0x14c
>> [ 24.725140] phylink_of_phy_connect+0x1c/0x34
>>
>> Therefore, phydev->devlink needs to be cleared when the device link is
>> deleted.
>>
>> Fixes: bc66fa87d4fd ("net: phy: Add link between phy dev and mac dev")
>> Signed-off-by: Wei Fang <wei.fang@....com>
>
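The stale-pointer sequence described in the quoted commit message can be modelled in user space. The sketch below is only an illustration, not the kernel code: `model_link`, `model_phy`, `model_link_del()`, `model_detach()`, `model_attach()` and `run_cycle()` are hypothetical names standing in for the device link, `phydev`, `device_link_del()`, `phy_detach()` and `phy_attach_direct()`. It counts how often a link that is already gone gets deleted again, with and without clearing the pointer on detach.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects; not real kernel types. */
struct model_link { int alive; };
struct model_phy { struct model_link *devlink; };

static int stale_deletes; /* deletions attempted on an already-deleted link */

/* Models device_link_del(): deleting a dead link is the bug we count. */
static void model_link_del(struct model_link *link)
{
    assert(link != NULL);
    if (!link->alive) {
        stale_deletes++;
        return;
    }
    link->alive = 0;
}

/* Models phy_detach(); clear_pointer selects the behaviour with the fix. */
static void model_detach(struct model_phy *phy, int clear_pointer)
{
    if (phy->devlink) {
        model_link_del(phy->devlink);
        if (clear_pointer)
            phy->devlink = NULL; /* assumed shape of the fix */
    }
}

/* Models phy_attach_direct(): on early failure the error path detaches. */
static int model_attach(struct model_phy *phy, struct model_link *link,
                        int fail_early, int clear_pointer)
{
    if (fail_early) {
        model_detach(phy, clear_pointer);
        return -1;
    }
    link->alive = 1;
    phy->devlink = link;
    return 0;
}

/* attach, detach, then a re-attach that fails before the link is added */
static int run_cycle(int clear_pointer)
{
    struct model_link link = { 0 };
    struct model_phy phy = { NULL };

    stale_deletes = 0;
    model_attach(&phy, &link, 0, clear_pointer);
    model_detach(&phy, clear_pointer);
    model_attach(&phy, &link, 1, clear_pointer);
    return stale_deletes;
}
```

In this model, `run_cycle(0)` reports one stale deletion because the pointer survives the first detach, while `run_cycle(1)` reports none.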
@Wei
What happens in the shared MDIO case?
1. Device 23040000 hosts the mdio node for both Ethernet PHYs, and device 23000000 references a phy-handle defined under device 23040000.
2. The driver is removed with rmmod.
3. By that point the parent devlink has already been deleted.
4. This causes the child mdio to access an entry, which results in the corruption shown in the log below.
5. I thought this fix would help, but I see that it does not help in this case.
I am wondering whether this is a legacy issue with the shared MDIO framework.

43369.232799: qcom-ethqos 23040000.ethernet eth1: stmmac_dvr_remove: removing driver
43369.233782: stmmac_pcs: Link Down
43369.258337: qcom-ethqos 23040000.ethernet eth1: FPE workqueue stop
43369.258522: br1: port 1(eth1) entered disabled state
43369.758779: qcom-ethqos 23040000.ethernet eth1: Timeout accessing MAC_VLAN_Tag_Filter
43369.758789: qcom-ethqos 23040000.ethernet eth1: failed to kill vid 0081/0
43369.759270: qcom-ethqos 23040000.ethernet eth1 (unregistering): left allmulticast mode
43369.759275: qcom-ethqos 23040000.ethernet eth1 (unregistering): left promiscuous mode
43369.759301: br1: port 1(eth1) entered disabled state
43370.259618: qcom-ethqos 23040000.ethernet eth1 (unregistering): Timeout accessing MAC_VLAN_Tag_Filter
43370.309863: qcom-ethqos 23000000.ethernet eth0: stmmac_dvr_remove: removing driver
43370.310019: list_del corruption, ffffff80c6ec9408->prev is LIST_POISON2 (dead000000000122)
43370.310034: ------------[ cut here ]------------
43370.310035: kernel BUG at lib/list_debug.c:59!
43370.310119: CPU: 4 PID: 3067767 Comm: rmmod Tainted: G W OE 6.6.65-rt47-debug #1
43370.310122: Hardware name: Qualcomm Technologies, Inc. SA8775P Ride (DT)
43370.310165: Call trace:
43370.310166: __list_del_entry_valid_or_report+0xa8/0xe0
43370.310168: __device_link_del+0x40/0xf0
43370.310172: device_link_put_kref+0xb4/0xc8
43370.310174: device_link_del+0x38/0x58
43370.310176: phy_detach+0x2c/0x170
43370.310181: phy_disconnect+0x4c/0x70
43370.310184: phylink_disconnect_phy+0x6c/0xc0 [phylink]
43370.310194: stmmac_release+0x60/0x358 [stmmac]
43370.310210: __dev_close_many+0xb4/0x160
43370.310213: dev_close_many+0xbc/0x1a0
43370.310215: unregister_netdevice_many_notify+0x178/0x870
43370.310218: unregister_netdevice_queue+0xf8/0x140
43370.310221: unregister_netdev+0x2c/0x48
43370.310223: stmmac_dvr_remove+0xd0/0x1b0 [stmmac]
43370.310233: devm_stmmac_pltfr_remove+0x2c/0x58 [stmmac_platform]
> Reviewed-by: Florian Fainelli <florian.fainelli@...adcom.com>