Message-ID: <20250912034538.1406132-1-zhangjian.3032@bytedance.com>
Date: Fri, 12 Sep 2025 11:45:38 +0800
From: Jian Zhang <zhangjian.3032@...edance.com>
To: netdev@...r.kernel.org,
	davem@...emloft.net,
	andrew+netdev@...n.ch,
	guoheyi@...ux.alibaba.com
Cc: Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>,
	Paolo Abeni <pabeni@...hat.com>,
	Jacky Chou <jacky_chou@...eedtech.com>,
	Simon Horman <horms@...nel.org>,
	Heiner Kallweit <hkallweit1@...il.com>,
	Jian Zhang <zhangjian.3032@...edance.com>,
	Uwe Kleine-König <u.kleine-koenig@...libre.com>,
	Bjorn Helgaas <bhelgaas@...gle.com>,
	linux-kernel@...r.kernel.org
Subject: [PATCH 1/1] Revert "drivers/net/ftgmac100: fix DHCP potential failure with systemd"

This reverts commit 1baf2e50e48f10f0ea07d53e13381fd0da1546d2.

The reverted commit can trigger a hung task when both of the following
happen concurrently:
* rtnetlink is setting the link down
* the PHY state_queue is triggered and calls ftgmac100_adjust_link

In the rtnetlink flow, `cancel_delayed_work_sync` is called while
`rtnl_lock` is held. This function cancels the delayed work item, or
waits for it to complete if it is already running. If the PHY
state_queue (delayed work) is executing `adjust_link` at that moment,
it calls `ftgmac100_reset`, which tries to take `rtnl_lock` and blocks.
Neither side can make progress, causing a deadlock.

This results in the following (partial) hung-task traces:
* rtnetlink (do_setlink):
[  243.326104] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  243.334871] task:systemd-network state:D stack:0     pid:711   ppid:1      flags:0x0000080d
[  243.344233] Call trace:
[  243.346986]  __switch_to+0xac/0xd8
[  243.350814]  __schedule+0x3c0/0xb78
[  243.354734]  schedule+0x60/0xc8
[  243.358258]  schedule_timeout+0x188/0x230
[  243.362762]  wait_for_completion+0x7c/0x168
[  243.367461]  __flush_work+0x29c/0x4c8
[  243.371579]  __cancel_work_timer+0x130/0x1b8
[  243.376376]  cancel_delayed_work_sync+0x18/0x28
[  243.381463]  phy_stop+0x7c/0x170
[  243.385098]  ftgmac100_stop+0x78/0xf0
[  243.389213]  __dev_close_many+0xb4/0x160
[  243.393621]  __dev_change_flags+0xfc/0x250
[  243.398226]  dev_change_flags+0x28/0x78
[  243.402536]  do_setlink+0x258/0xdb0
[  243.406460]  rtnl_setlink+0xf0/0x1b8
[  243.410484]  rtnetlink_rcv_msg+0x2a0/0x768
[  243.415097]  netlink_rcv_skb+0x64/0x138
[  243.419473]  rtnetlink_rcv+0x1c/0x30
[  243.423540]  netlink_unicast+0x1c8/0x2a8
[  243.427973]  netlink_sendmsg+0x1c4/0x438
[  243.432402]  __sys_sendto+0xe4/0x178
[  243.436447]  __arm64_sys_sendto+0x2c/0x40
[  243.440966]  invoke_syscall.constprop.0+0x60/0x108
[  243.446397]  do_el0_svc+0xa4/0xc8
[  243.450171]  el0_svc+0x48/0x118
[  243.453710]  el0t_64_sync_handler+0x118/0x128
[  243.458648]  el0t_64_sync+0x14c/0x150

* state_queue (phy_state_machine):
[  242.882453] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  242.891226] task:kworker/3:0     state:D stack:0     pid:32    ppid:2      flags:0x00000008
[  242.900592] Workqueue: events_power_efficient phy_state_machine
[  242.907250] Call trace:
[  242.910001]  __switch_to+0xac/0xd8
[  242.913813]  __schedule+0x3c0/0xb78
[  242.917735]  schedule+0x60/0xc8
[  242.921268]  schedule_preempt_disabled+0x28/0x48
[  242.926449]  __mutex_lock+0x1cc/0x400
[  242.930565]  mutex_lock_nested+0x28/0x38
[  242.934971]  rtnl_lock+0x60/0x70
[  242.938607]  ftgmac100_reset+0x34/0x248
[  242.942919]  ftgmac100_adjust_link+0xe0/0x150
[  242.947816]  phy_link_change+0x34/0x68
[  242.952032]  phy_check_link_status+0x8c/0xf8
[  242.956829]  phy_state_machine+0x16c/0x2e0
[  242.961428]  process_one_work+0x258/0x620
[  242.965934]  worker_thread+0x1e8/0x3e0
[  242.970148]  kthread+0x114/0x120
[  242.973762]  ret_from_fork+0x10/0x20

Signed-off-by: Jian Zhang <zhangjian.3032@...edance.com>
---
 drivers/net/ethernet/faraday/ftgmac100.c | 13 ++-----------
 1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/faraday/ftgmac100.c b/drivers/net/ethernet/faraday/ftgmac100.c
index a863f7841210..477719a518bc 100644
--- a/drivers/net/ethernet/faraday/ftgmac100.c
+++ b/drivers/net/ethernet/faraday/ftgmac100.c
@@ -1448,17 +1448,8 @@ static void ftgmac100_adjust_link(struct net_device *netdev)
 	/* Disable all interrupts */
 	iowrite32(0, priv->base + FTGMAC100_OFFSET_IER);
 
-	/* Release phy lock to allow ftgmac100_reset to acquire it, keeping lock
-	 * order consistent to prevent dead lock.
-	 */
-	if (netdev->phydev)
-		mutex_unlock(&netdev->phydev->lock);
-
-	ftgmac100_reset(priv);
-
-	if (netdev->phydev)
-		mutex_lock(&netdev->phydev->lock);
-
+	/* Reset the adapter asynchronously */
+	schedule_work(&priv->reset_task);
 }
 
 static int ftgmac100_mii_probe(struct net_device *netdev)
-- 
2.47.0
