Message-ID: <2b1dc053-8c9a-e3e4-b450-eecdfca3fe16@gmail.com>
Date: Thu, 30 Sep 2021 11:58:21 +0200
From: Rafał Miłecki <zajec5@...il.com>
To: Andrew Lunn <andrew@...n.ch>,
Heiner Kallweit <hkallweit1@...il.com>,
Russell King <linux@...linux.org.uk>,
Network Development <netdev@...r.kernel.org>
Cc: Florian Fainelli <f.fainelli@...il.com>,
BCM Kernel Feedback <bcm-kernel-feedback-list@...adcom.com>,
Vivek Unune <npcomplete13@...il.com>
Subject: Lockup in phy_probe() for MDIO device (Broadcom's switch)
Hi,

I've just received a report of a kernel lockup after switching an OpenWrt
platform from kernel 5.4 to kernel 5.10:
https://bugs.openwrt.org/index.php?do=details&task_id=4055

The problem is in phy_probe(), at its:
mutex_lock(&phydev->lock);

It seems to me that the "lock" mutex doesn't get initialized: it seems
phy_device_create() doesn't get called for an MDIO device.

This isn't necessarily a PHY / MDIO regression. It could be some core
change that exposed a PHY / MDIO bug.
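
To make the suspicion above concrete, here is a minimal userspace sketch
(not kernel code). It only models the ordering assumption: the probe path
takes a lock that is set up solely by the create path. All model_* names
are hypothetical, and the "initialized" flag stands in for real mutex
state. In the real kernel an uninitialized mutex does not return an error;
the trace below shows the CPU spinning in _raw_spin_lock() called from
__mutex_lock() instead.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a lock; not the kernel's struct mutex. */
struct model_lock {
	bool initialized;
	bool held;
};

/* Hypothetical stand-in for struct phy_device. */
struct model_phy {
	struct model_lock lock;
};

static void model_lock_init(struct model_lock *l)
{
	l->initialized = true;
	l->held = false;
}

/* Returns 0 on success, -1 if the lock was never initialized. */
static int model_lock_acquire(struct model_lock *l)
{
	if (!l->initialized)
		return -1;
	l->held = true;
	return 0;
}

/* Models the create path: the only place where the lock gets set up. */
static void model_phy_create(struct model_phy *p)
{
	model_lock_init(&p->lock);
}

/* Models the probe path: it assumes the create path has already run. */
static int model_phy_probe(struct model_phy *p)
{
	return model_lock_acquire(&p->lock);
}

/* Exercises both orderings: create-then-probe and probe-only. */
static void model_self_check(void)
{
	struct model_phy created = {0};
	struct model_phy skipped = {0};

	model_phy_create(&created);
	assert(model_phy_probe(&created) == 0);  /* lock was set up */
	assert(model_phy_probe(&skipped) == -1); /* create was skipped */
}
```

The second case in model_self_check() corresponds to what I suspect
happens for the switch device: probe runs on an object that never went
through the create path.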
*** Lockup ***
[ 0.000000] Booting Linux on physical CPU 0x0
[ 0.000000] Linux version 5.10.64 (rmilecki@...alhost.localdomain) (arm-openwrt-linux-muslgnueabi-gcc (OpenWrt GCC 11.2.0 r17558+1-71e96532df) 11.2.0, GNU ld (GNU Binutils) 2.36.1) #0 SMP Wed Sep 29 20:08:07 2021
(...)
[ 5.592447] libphy: Fixed MDIO Bus: probed
[ 5.596809] [of_mdiobus_register:254] np:/mdio@...03000
[ 5.602333] libphy: iProc MDIO bus: probed
[ 5.606479] iproc-mdio 18003000.mdio: Broadcom iProc MDIO bus registered
[ 5.613439] [of_mdiobus_register:254] np:/mdio-mux@...03000/mdio@0
[ 5.620101] libphy: mdio_mux: probed
[ 5.623709] [of_mdiobus_register:282] child:/mdio-mux@...03000/mdio@...sb3-phy@10
[ 5.631571] [of_mdiobus_register:254] np:/mdio-mux@...03000/mdio@200
[ 5.638426] libphy: mdio_mux: probed
[ 5.642032] [of_mdiobus_register:282] child:/mdio-mux@...03000/mdio@.../switch@0
[ 5.649841] ------------[ cut here ]------------
[ 5.654503] WARNING: CPU: 0 PID: 1 at drivers/net/phy/phy_device.c:2839 phy_probe+0x58/0x1e8
[ 5.662983] Modules linked in:
[ 5.666055] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.64 #0
[ 5.672074] Hardware name: BCM5301X
[ 5.675587] [<c0108410>] (unwind_backtrace) from [<c0104bc4>] (show_stack+0x10/0x14)
[ 5.683359] [<c0104bc4>] (show_stack) from [<c03dbfc8>] (dump_stack+0x94/0xa8)
[ 5.690609] [<c03dbfc8>] (dump_stack) from [<c01183e4>] (__warn+0xb8/0x114)
[ 5.697591] [<c01183e4>] (__warn) from [<c01184a8>] (warn_slowpath_fmt+0x68/0x78)
[ 5.705095] [<c01184a8>] (warn_slowpath_fmt) from [<c04b85d0>] (phy_probe+0x58/0x1e8)
[ 5.712951] [<c04b85d0>] (phy_probe) from [<c04569f8>] (really_probe+0xfc/0x4e0)
[ 5.720361] [<c04569f8>] (really_probe) from [<c0454c50>] (bus_for_each_drv+0x74/0x98)
[ 5.728298] [<c0454c50>] (bus_for_each_drv) from [<c0456f90>] (__device_attach+0xcc/0x120)
[ 5.736584] [<c0456f90>] (__device_attach) from [<c0455bd8>] (bus_probe_device+0x84/0x8c)
[ 5.744782] [<c0455bd8>] (bus_probe_device) from [<c0452284>] (device_add+0x300/0x77c)
[ 5.752724] [<c0452284>] (device_add) from [<c04b9c4c>] (mdio_device_register+0x24/0x48)
[ 5.760836] [<c04b9c4c>] (mdio_device_register) from [<c04c15d4>] (of_mdiobus_register+0x1f8/0x330)
[ 5.769904] [<c04c15d4>] (of_mdiobus_register) from [<c04c1c1c>] (mdio_mux_init+0x178/0x2c0)
[ 5.778363] [<c04c1c1c>] (mdio_mux_init) from [<c04c1ef8>] (mdio_mux_mmioreg_probe+0x138/0x1fc)
[ 5.787089] [<c04c1ef8>] (mdio_mux_mmioreg_probe) from [<c04587bc>] (platform_drv_probe+0x34/0x70)
[ 5.796066] [<c04587bc>] (platform_drv_probe) from [<c04569f8>] (really_probe+0xfc/0x4e0)
[ 5.804266] [<c04569f8>] (really_probe) from [<c04573dc>] (device_driver_attach+0xe4/0xf4)
[ 5.812552] [<c04573dc>] (device_driver_attach) from [<c0457468>] (__driver_attach+0x7c/0x110)
[ 5.821186] [<c0457468>] (__driver_attach) from [<c0454bb0>] (bus_for_each_dev+0x64/0x90)
[ 5.829385] [<c0454bb0>] (bus_for_each_dev) from [<c0455dd0>] (bus_add_driver+0xf8/0x1e0)
[ 5.837585] [<c0455dd0>] (bus_add_driver) from [<c0457a74>] (driver_register+0x88/0x118)
[ 5.845697] [<c0457a74>] (driver_register) from [<c01017e4>] (do_one_initcall+0x54/0x1e8)
[ 5.853907] [<c01017e4>] (do_one_initcall) from [<c0801118>] (kernel_init_freeable+0x23c/0x290)
[ 5.862628] [<c0801118>] (kernel_init_freeable) from [<c065a550>] (kernel_init+0x8/0x118)
[ 5.870826] [<c065a550>] (kernel_init) from [<c0100128>] (ret_from_fork+0x14/0x2c)
[ 5.878413] Exception stack(0xc1035fb0 to 0xc1035ff8)
[ 5.883470] 5fa0: 00000000 00000000 00000000 00000000
[ 5.891662] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 5.899852] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 5.906509] ---[ end trace 6a8fa3807352bffb ]---
[ 5.911144] Broadcom B53 (2) 0.200:00: [phy_probe:2840] TAKING LOCK...
[ 26.924625] rcu: INFO: rcu_sched self-detected stall on CPU
[ 26.930213] rcu: 0-....: (2099 ticks this GP) idle=e3e/1/0x40000002 softirq=109/109 fqs=1050
[ 26.938844] (t=2100 jiffies g=-1111 q=287)
[ 26.943031] NMI backtrace for cpu 0
[ 26.946523] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 5.10.64 #0
[ 26.953934] Hardware name: BCM5301X
[ 26.957437] [<c0108410>] (unwind_backtrace) from [<c0104bc4>] (show_stack+0x10/0x14)
[ 26.965206] [<c0104bc4>] (show_stack) from [<c03dbfc8>] (dump_stack+0x94/0xa8)
[ 26.972450] [<c03dbfc8>] (dump_stack) from [<c03e3890>] (nmi_cpu_backtrace+0xc8/0xf4)
[ 26.980295] [<c03e3890>] (nmi_cpu_backtrace) from [<c03e39c0>] (nmi_trigger_cpumask_backtrace+0x104/0x13c)
[ 26.989974] [<c03e39c0>] (nmi_trigger_cpumask_backtrace) from [<c01700a0>] (rcu_dump_cpu_stacks+0xe4/0x10c)
[ 26.999743] [<c01700a0>] (rcu_dump_cpu_stacks) from [<c0175444>] (rcu_sched_clock_irq+0x6e4/0x8ac)
[ 27.008731] [<c0175444>] (rcu_sched_clock_irq) from [<c017b47c>] (update_process_times+0x88/0xbc)
[ 27.017632] [<c017b47c>] (update_process_times) from [<c018ac8c>] (tick_sched_timer+0x78/0x274)
[ 27.026349] [<c018ac8c>] (tick_sched_timer) from [<c017b9e8>] (__hrtimer_run_queues+0x15c/0x218)
[ 27.035157] [<c017b9e8>] (__hrtimer_run_queues) from [<c017c5c8>] (hrtimer_interrupt+0x11c/0x298)
[ 27.044056] [<c017c5c8>] (hrtimer_interrupt) from [<c0107a30>] (twd_handler+0x34/0x3c)
[ 27.051993] [<c0107a30>] (twd_handler) from [<c0167c68>] (handle_percpu_devid_irq+0x78/0x148)
[ 27.060547] [<c0167c68>] (handle_percpu_devid_irq) from [<c0162470>] (__handle_domain_irq+0x84/0xd8)
[ 27.069706] [<c0162470>] (__handle_domain_irq) from [<c03f48e8>] (gic_handle_irq+0x80/0x94)
[ 27.078076] [<c03f48e8>] (gic_handle_irq) from [<c0100aec>] (__irq_svc+0x6c/0x90)
[ 27.085575] Exception stack(0xc1035c28 to 0xc1035c70)
[ 27.090634] 5c20: c116648c 00000000 0000c116 00006488 c1166488 ffffe000
[ 27.098825] 5c40: 00000000 c1034000 00000002 c0982be8 c116648c 00000000 00000000 c1035c78
[ 27.107022] 5c60: c065ce58 c065fe64 80000013 ffffffff
[ 27.112086] [<c0100aec>] (__irq_svc) from [<c065fe64>] (_raw_spin_lock+0x2c/0x40)
[ 27.119585] [<c065fe64>] (_raw_spin_lock) from [<c065ce58>] (__mutex_lock.constprop.0+0x1b8/0x520)
[ 27.128571] [<c065ce58>] (__mutex_lock.constprop.0) from [<c04b85f4>] (phy_probe+0x7c/0x1e8)
[ 27.137032] [<c04b85f4>] (phy_probe) from [<c04569f8>] (really_probe+0xfc/0x4e0)
[ 27.144444] [<c04569f8>] (really_probe) from [<c0454c50>] (bus_for_each_drv+0x74/0x98)
[ 27.152380] [<c0454c50>] (bus_for_each_drv) from [<c0456f90>] (__device_attach+0xcc/0x120)
[ 27.160667] [<c0456f90>] (__device_attach) from [<c0455bd8>] (bus_probe_device+0x84/0x8c)
[ 27.168865] [<c0455bd8>] (bus_probe_device) from [<c0452284>] (device_add+0x300/0x77c)
[ 27.176797] [<c0452284>] (device_add) from [<c04b9c4c>] (mdio_device_register+0x24/0x48)
[ 27.184911] [<c04b9c4c>] (mdio_device_register) from [<c04c15d4>] (of_mdiobus_register+0x1f8/0x330)
[ 27.193977] [<c04c15d4>] (of_mdiobus_register) from [<c04c1c1c>] (mdio_mux_init+0x178/0x2c0)
[ 27.202437] [<c04c1c1c>] (mdio_mux_init) from [<c04c1ef8>] (mdio_mux_mmioreg_probe+0x138/0x1fc)
[ 27.211154] [<c04c1ef8>] (mdio_mux_mmioreg_probe) from [<c04587bc>] (platform_drv_probe+0x34/0x70)
[ 27.220133] [<c04587bc>] (platform_drv_probe) from [<c04569f8>] (really_probe+0xfc/0x4e0)
[ 27.228331] [<c04569f8>] (really_probe) from [<c04573dc>] (device_driver_attach+0xe4/0xf4)
[ 27.236609] [<c04573dc>] (device_driver_attach) from [<c0457468>] (__driver_attach+0x7c/0x110)
[ 27.245243] [<c0457468>] (__driver_attach) from [<c0454bb0>] (bus_for_each_dev+0x64/0x90)
[ 27.253434] [<c0454bb0>] (bus_for_each_dev) from [<c0455dd0>] (bus_add_driver+0xf8/0x1e0)
[ 27.261633] [<c0455dd0>] (bus_add_driver) from [<c0457a74>] (driver_register+0x88/0x118)
[ 27.269746] [<c0457a74>] (driver_register) from [<c01017e4>] (do_one_initcall+0x54/0x1e8)
[ 27.277949] [<c01017e4>] (do_one_initcall) from [<c0801118>] (kernel_init_freeable+0x23c/0x290)
[ 27.286667] [<c0801118>] (kernel_init_freeable) from [<c065a550>] (kernel_init+0x8/0x118)
[ 27.294865] [<c065a550>] (kernel_init) from [<c0100128>] (ret_from_fork+0x14/0x2c)
[ 27.302452] Exception stack(0xc1035fb0 to 0xc1035ff8)
[ 27.307511] 5fa0: 00000000 00000000 00000000 00000000
[ 27.315702] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 27.323891] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000
*** Device tree ***
Base: arch/arm/boot/dts/bcm5301x.dtsi
Relevant part:
mdio-bus-mux@...03000 {
	compatible = "mdio-mux-mmioreg";
	mdio-parent-bus = <&mdio>;
	#address-cells = <1>;
	#size-cells = <0>;
	reg = <0x18003000 0x4>;
	mux-mask = <0x200>;

	mdio@0 {
		reg = <0x0>;
		#address-cells = <1>;
		#size-cells = <0>;

		usb3_phy: usb3-phy@10 {
			compatible = "brcm,ns-ax-usb3-phy";
			reg = <0x10>;
			usb3-dmp-syscon = <&usb3_dmp>;
			#phy-cells = <0>;
			status = "disabled";
		};
	};

	mdio@200 {
		reg = <0x200>;
		#address-cells = <1>;
		#size-cells = <0>;

		switch@0 {
			compatible = "brcm,bcm53125";
			#address-cells = <1>;
			#size-cells = <0>;
			reset-gpios = <&chipcommon 10 GPIO_ACTIVE_LOW>;
			reset-names = "robo_reset";
			reg = <0>;
			dsa,member = <1 0>;
			pinctrl-names = "default";
			pinctrl-0 = <&pinmux_mdio>;

			ports {
				#address-cells = <1>;
				#size-cells = <0>;

				port@0 {
					reg = <0>;
					label = "lan1";
				};

				port@1 {
					reg = <1>;
					label = "lan5";
				};

				port@2 {
					reg = <2>;
					label = "lan2";
				};

				port@3 {
					reg = <3>;
					label = "lan6";
				};

				port@4 {
					reg = <4>;
					label = "lan3";
				};
			};
		};
	};
};
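
For readers not familiar with "mdio-mux-mmioreg": as far as I understand
the binding, each child bus "reg" value is written into the bits selected
by "mux-mask" of the 32-bit register given in "reg", which is how mdio@0
and mdio@200 above get selected. A minimal userspace sketch of that
read-modify-write follows. model_mux_select() is a hypothetical name, and
the register is modeled as a plain value rather than real MMIO.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns the new register value needed to select a child bus.
 * reg:   current value of the mux register
 * mask:  "mux-mask" from the device tree (0x200 above)
 * child: the child bus "reg" value (0x0 or 0x200 above)
 * Bits outside the mask are preserved.
 */
static uint32_t model_mux_select(uint32_t reg, uint32_t mask, uint32_t child)
{
	if ((reg & mask) != child)
		reg = (reg & ~mask) | child;
	return reg;
}

/* Checks selection of both child buses from the fragment above. */
static void model_mux_self_check(void)
{
	assert(model_mux_select(0x000, 0x200, 0x200) == 0x200); /* mdio@200 */
	assert(model_mux_select(0x2ff, 0x200, 0x000) == 0x0ff); /* mdio@0 */
	assert(model_mux_select(0x200, 0x200, 0x200) == 0x200); /* no change */
}
```

The switch that triggers the lockup sits behind the second selection
(child value 0x200).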
*** Used debugging diff ***
diff --git a/drivers/net/mdio/of_mdio.c b/drivers/net/mdio/of_mdio.c
index 4daf94bb5..dde775c92 100644
--- a/drivers/net/mdio/of_mdio.c
+++ b/drivers/net/mdio/of_mdio.c
@@ -251,6 +251,7 @@ int of_mdiobus_register(struct mii_bus *mdio, struct device_node *np)
bool scanphys = false;
int addr, rc;
+pr_info("[%s:%d] np:%pOF\n", __func__, __LINE__, np);
if (!np)
return mdiobus_register(mdio);
@@ -278,6 +279,7 @@ int of_mdiobus_register(struct mii_bus *mdio, struct device_node *np)
/* Loop over the child nodes and register a phy_device for each phy */
for_each_available_child_of_node(np, child) {
+pr_info("[%s:%d] child:%pOF\n", __func__, __LINE__, child);
addr = of_mdio_parse_addr(&mdio->dev, child);
if (addr < 0) {
scanphys = true;
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 950277e4d..a0a46af82 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -592,6 +592,7 @@ struct phy_device *phy_device_create(struct mii_bus *bus, int addr, u32 phy_id,
dev->state = PHY_DOWN;
+dev_info(&mdiodev->dev, "[%s:%d] INIT MUTEX\n", __func__, __LINE__);
mutex_init(&dev->lock);
INIT_DELAYED_WORK(&dev->state_queue, phy_state_machine);
@@ -2835,7 +2836,10 @@ static int phy_probe(struct device *dev)
if (phydrv->flags & PHY_IS_INTERNAL)
phydev->is_internal = true;
+WARN_ON(1);
+dev_info(dev, "[%s:%d] TAKING LOCK...\n", __func__, __LINE__);
mutex_lock(&phydev->lock);
+dev_info(dev, "[%s:%d] LOCKED\n", __func__, __LINE__);
/* Deassert the reset signal */
phy_device_reset(phydev, 0);