lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d696a426-40bb-4c1a-b42d-990fb690de5e@quicinc.com>
Date: Tue, 3 Jun 2025 13:39:47 -0700
From: "Abhishek Chauhan (ABC)" <quic_abchauha@...cinc.com>
To: Florian Fainelli <f.fainelli@...il.com>, Wei Fang <wei.fang@....com>,
        <andrew@...n.ch>, <hkallweit1@...il.com>, <linux@...linux.org.uk>,
        <davem@...emloft.net>, <edumazet@...gle.com>, <kuba@...nel.org>,
        <pabeni@...hat.com>, <xiaolei.wang@...driver.com>
CC: <netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        <imx@...ts.linux.dev>
Subject: Re: [PATCH v2 net] net: phy: clear phydev->devlink when the link is
 deleted



On 5/23/2025 8:19 AM, Florian Fainelli wrote:
> 
> 
> On 5/23/2025 1:37 AM, Wei Fang wrote:
>> There is a potential crash issue when disabling and re-enabling the
>> network port. When disabling the network port, phy_detach() calls
>> device_link_del() to remove the device link, but it does not clear
>> phydev->devlink, so phydev->devlink is not a NULL pointer. Then the
>> network port is re-enabled, but if phy_attach_direct() fails before
>> calling device_link_add(), the code jumps to the "error" label and
>> calls phy_detach(). Since phydev->devlink retains the old value from
>> the previous attach/detach cycle, device_link_del() uses the old value,
>> which accesses a NULL pointer and causes a crash. The simplified crash
>> log is as follows.
>>
>> [   24.702421] Call trace:
>> [   24.704856]  device_link_put_kref+0x20/0x120
>> [   24.709124]  device_link_del+0x30/0x48
>> [   24.712864]  phy_detach+0x24/0x168
>> [   24.716261]  phy_attach_direct+0x168/0x3a4
>> [   24.720352]  phylink_fwnode_phy_connect+0xc8/0x14c
>> [   24.725140]  phylink_of_phy_connect+0x1c/0x34
>>
>> Therefore, phydev->devlink needs to be cleared when the device link is
>> deleted.
>>
>> Fixes: bc66fa87d4fd ("net: phy: Add link between phy dev and mac dev")
>> Signed-off-by: Wei Fang <wei.fang@....com>
> 
@Wei 
What happens in case of shared mdio ? 

1. Device 23040000 has the mdio node of both the ethernet phy and device 23000000 references the phy-handle present in the Device 23040000
2. When rmmod of the driver happens 
3. the parent devlink is already deleted. 
4. This cause the child mdio to access an entry causing a corruption. 
5. Thought this fix would help but i see that its not helping the case. 

Wondering if this is a legacy issue with shared mdio framework. 


 43369.232799:   qcom-ethqos 23040000.ethernet eth1: stmmac_dvr_remove: removing driver
 43369.233782:   stmmac_pcs: Link Down
 43369.258337:   qcom-ethqos 23040000.ethernet eth1: FPE workqueue stop
 43369.258522:   br1: port 1(eth1) entered disabled state
 43369.758779:   qcom-ethqos 23040000.ethernet eth1: Timeout accessing MAC_VLAN_Tag_Filter
 43369.758789:   qcom-ethqos 23040000.ethernet eth1: failed to kill vid 0081/0
 43369.759270:   qcom-ethqos 23040000.ethernet eth1 (unregistering): left allmulticast mode
 43369.759275:   qcom-ethqos 23040000.ethernet eth1 (unregistering): left promiscuous mode
 43369.759301:   br1: port 1(eth1) entered disabled state
 43370.259618:   qcom-ethqos 23040000.ethernet eth1 (unregistering): Timeout accessing MAC_VLAN_Tag_Filter
 43370.309863:   qcom-ethqos 23000000.ethernet eth0: stmmac_dvr_remove: removing driver
 43370.310019:   list_del corruption, ffffff80c6ec9408->prev is LIST_POISON2 (dead000000000122)
 43370.310034:   ------------[ cut here ]------------
 43370.310035:   kernel BUG at lib/list_debug.c:59!
 43370.310119:   CPU: 4 PID: 3067767 Comm: rmmod Tainted: G        W  OE      6.6.65-rt47-debug #1
 43370.310122:   Hardware name: Qualcomm Technologies, Inc. SA8775P Ride (DT)
 43370.310165:   Call trace:
 43370.310166:    __list_del_entry_valid_or_report+0xa8/0xe0
 43370.310168:    __device_link_del+0x40/0xf0
 43370.310172:    device_link_put_kref+0xb4/0xc8
 43370.310174:    device_link_del+0x38/0x58
 43370.310176:    phy_detach+0x2c/0x170
 43370.310181:    phy_disconnect+0x4c/0x70
 43370.310184:    phylink_disconnect_phy+0x6c/0xc0 [phylink]
 43370.310194:    stmmac_release+0x60/0x358 [stmmac]
 43370.310210:    __dev_close_many+0xb4/0x160
 43370.310213:    dev_close_many+0xbc/0x1a0
 43370.310215:    unregister_netdevice_many_notify+0x178/0x870
 43370.310218:    unregister_netdevice_queue+0xf8/0x140
 43370.310221:    unregister_netdev+0x2c/0x48
 43370.310223:    stmmac_dvr_remove+0xd0/0x1b0 [stmmac]
 43370.310233:    devm_stmmac_pltfr_remove+0x2c/0x58 [stmmac_platform]


> Reviewed-by: Florian Fainelli <florian.fainelli@...adcom.com>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ