[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <528f797d-445b-a314-d8ef-db15a3b6a8ce@enneenne.com>
Date: Sat, 9 Feb 2019 09:24:48 +0100
From: Rodolfo Giometti <giometti@...eenne.com>
To: Florian Fainelli <f.fainelli@...il.com>,
Andrew Lunn <andrew@...n.ch>,
Vivien Didelot <vivien.didelot@...oirfairelinux.com>
Cc: "David S. Miller" <davem@...emloft.net>, netdev@...r.kernel.org
Subject: Re: Possible bug into DSA2 code.
Hello,
I'm working with EPRESSObin and DSA2 where I added the ability to dynamically
load and unload switch configurations by using DT-overlay (a patchwork from here
https://lore.kernel.org/patchwork/patch/468129/). During my tests I notice that
when I remove the overlay in order to disable the switch I got the following BUG
message:
[ 24.862079] ------------[ cut here ]------------
[ 24.866767] kernel BUG at drivers/net/phy/mdio_bus.c:448!
[ 24.872328] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[ 24.877967] Modules linked in:
[ 24.881109] CPU: 0 PID: 2189 Comm: rmdir Not tainted 4.19.0-sw3720tsn1-038561
[ 24.890509] Hardware name: Kbact sw3720tsn1 smart switch (DT)
[ 24.896426] pstate: 20000005 (nzCv daif -PAN -UAO)
[ 24.901365] pc : mdiobus_unregister+0x90/0x98
[ 24.905838] lr : mv88e6xxx_mdios_unregister+0x64/0x88
[ 24.911028] sp : ffff80006aea7730
[ 24.914434] x29: ffff80006aea7730 x28: ffff80006adc27c0
[ 24.919898] x27: ffff0000091e3000 x26: ffff0000090d9000
[ 24.925365] x25: ffff80006c9420a0 x24: ffff80006c3e1c10
[ 24.930828] x23: 0000000000000060 x22: ffff80006c9e6018
[ 24.936294] x21: ffff80006c9e6110 x20: ffff80006c942800
[ 24.941758] x19: ffff80006c942d40 x18: ffffffffffffffff
[ 24.947225] x17: 0000000000000000 x16: ffff80006adc27c0
[ 24.952690] x15: ffff0000090d96c8 x14: ffff000089198737
[ 24.958156] x13: ffff000009198745 x12: ffff0000090d9940
[ 24.963621] x11: ffff0000086be4b0 x10: 0000000000000040
[ 24.969087] x9 : ffff0000090f4710 x8 : 0000000040000000
[ 24.974553] x7 : ffff0000090d96c8 x6 : ffff80006a97a921
[ 24.980018] x5 : ffff80006a97a920 x4 : 0000000000000fff
[ 24.985483] x3 : 0000000000000000 x2 : 0000000000000000
[ 24.990948] x1 : 0000000000000003 x0 : ffff80006c942800
[ 24.996416] Process rmdir (pid: 2189, stack limit = 0x(____ptrval____))
[ 25.003225] Call trace:
[ 25.005737] mdiobus_unregister+0x90/0x98
[ 25.009858] mv88e6xxx_mdios_unregister+0x64/0x88
[ 25.014696] mv88e6xxx_remove+0x2c/0x88
[ 25.018637] mdio_remove+0x20/0x48
[ 25.022135] device_release_driver_internal+0x1a8/0x240
[ 25.027509] device_release_driver+0x14/0x20
[ 25.031899] bus_remove_device+0x110/0x128
[ 25.036109] device_del+0x124/0x340
[ 25.039693] mdio_device_remove+0x14/0x28
[ 25.043815] mdiobus_unregister+0x50/0x98
[ 25.047940] orion_mdio_remove+0x34/0xb0
[ 25.051970] platform_drv_remove+0x24/0x50
[ 25.056181] device_release_driver_internal+0x1a8/0x240
[ 25.061557] device_release_driver+0x14/0x20
[ 25.065947] bus_remove_device+0x110/0x128
[ 25.070158] device_del+0x124/0x340
[ 25.073742] platform_device_del.part.3+0x24/0x90
[ 25.078580] platform_device_unregister+0x18/0x30
[ 25.083422] of_platform_device_destroy+0xb4/0xb8
[ 25.088257] of_platform_notify+0xa8/0x170
[ 25.092471] notifier_call_chain+0x54/0x98
[ 25.096679] blocking_notifier_call_chain+0x48/0x70
[ 25.101697] of_property_notify+0x60/0xa0
[ 25.105819] __of_changeset_entry_notify+0x54/0x100
[ 25.110836] __of_changeset_revert_notify+0x3c/0x70
[ 25.115857] of_overlay_remove+0x2ac/0x378
[ 25.120066] cfs_overlay_release+0x28/0x50
[ 25.124278] config_item_put.part.0+0x70/0xb0
[ 25.128757] config_item_put+0x10/0x20
[ 25.132609] configfs_rmdir+0x1ec/0x2e0
[ 25.136554] vfs_rmdir+0x7c/0x170
[ 25.139956] do_rmdir+0x17c/0x1d0
[ 25.143361] __arm64_sys_unlinkat+0x4c/0x60
[ 25.147664] el0_svc_common+0x60/0xe8
[ 25.151426] el0_svc_handler+0x2c/0x80
[ 25.155279] el0_svc+0x8/0xc
[ 25.158236] Code: a94153f3 a9425bf5 a8c37bfd d65f03c0 (d4210000)
[ 25.164509] ---[ end trace 5138591d8b9c9222 ]---
After looking into the kernel code I discovered that this depends to the commit
1eb59443e72c69edbb836626f9f7f7e82427eeac which modifications I report below:
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index 921a36fd139d..4e0f3c268103 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -312,6 +312,18 @@ static int dsa_ds_apply(struct dsa_switch_tree *dst, struct
dsa_switch *ds)
if (err < 0)
return err;
+ if (!ds->slave_mii_bus && ds->drv->phy_read) {
+ ds->slave_mii_bus = devm_mdiobus_alloc(ds->dev);
+ if (!ds->slave_mii_bus)
+ return -ENOMEM;
+
+ dsa_slave_mii_bus_init(ds);
+
+ err = mdiobus_register(ds->slave_mii_bus);
+ if (err < 0)
+ return err;
+ }
+
for (index = 0; index < DSA_MAX_PORTS; index++) {
port = ds->ports[index].dn;
if (!port)
@@ -361,6 +373,9 @@ static void dsa_ds_unapply(struct dsa_switch_tree *dst,
struct dsa_switch *ds)
dsa_user_port_unapply(port, index, ds);
}
+
+ if (ds->slave_mii_bus && ds->drv->phy_read)
+ mdiobus_unregister(ds->slave_mii_bus);
}
static int dsa_dst_apply(struct dsa_switch_tree *dst)
This patch looks buggy to me because if this patch has the target to catch
drivers that call dsa_ds_apply() having ds->slave_mii_bus set to NULL with a
defined ds->ops->phy_read, then it should take into account also those drivers
that have both ds->slave_mii_bus and ds->ops->phy_read already defined and then
DO NOT call mdiobus_unregister() during dsa_ds_unapply()! This because DSA
should NOT undo an operation it never did.
So we I see two possible solutions:
1) having both ds->slave_mii_bus and ds->ops->phy_read already defined is an
error, then it must be signaled to the calling code, or
2) we have to use a flag to signal dsa_ds_unapply() what to do.
I don't know DSA too much to provide the rigth-thing(TM) so I'm waiting for a
reply before proposing a patch. :-)
Ciao,
Rodolfo
--
GNU/Linux Solutions e-mail: giometti@...eenne.com
Linux Device Driver giometti@...ux.it
Embedded Systems phone: +39 349 2432127
UNIX programming skype: rodolfo.giometti
Powered by blists - more mailing lists