[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6d6cc153-85ca-62c8-8d9c-4f4c6a325e91@gmail.com>
Date: Tue, 12 Jan 2021 10:51:39 -0800
From: Florian Fainelli <f.fainelli@...il.com>
To: Vladimir Oltean <olteanv@...il.com>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>, netdev@...r.kernel.org
Cc: andrew@...n.ch, vivien.didelot@...il.com
Subject: Re: [PATCH net] net: dsa: unbind all switches from tree when DSA
master unbinds
On 1/11/21 3:09 PM, Vladimir Oltean wrote:
> From: Vladimir Oltean <vladimir.oltean@....com>
>
> Currently the following happens when a DSA master driver unbinds while
> there are DSA switches attached to it:
>
> $ echo 0000:00:00.5 > /sys/bus/pci/drivers/mscc_felix/unbind
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 392 at net/core/dev.c:9507
> Call trace:
> rollback_registered_many+0x5fc/0x688
> unregister_netdevice_queue+0x98/0x120
> dsa_slave_destroy+0x4c/0x88
> dsa_port_teardown.part.16+0x78/0xb0
> dsa_tree_teardown_switches+0x58/0xc0
> dsa_unregister_switch+0x104/0x1b8
> felix_pci_remove+0x24/0x48
> pci_device_remove+0x48/0xf0
> device_release_driver_internal+0x118/0x1e8
> device_driver_detach+0x28/0x38
> unbind_store+0xd0/0x100
>
> Located at the above location is this WARN_ON:
>
> /* Notifier chain MUST detach us all upper devices. */
> WARN_ON(netdev_has_any_upper_dev(dev));
>
> Other stacked interfaces, like VLAN, do indeed listen for
> NETDEV_UNREGISTER on the real_dev and also unregister themselves at that
> time, which is clearly the behavior that rollback_registered_many
> expects. But DSA interfaces are not VLAN. They have backing hardware
> (platform devices, PCI devices, MDIO, SPI etc) which have a life cycle
> of their own and we can't just trigger an unregister from the DSA
> framework when we receive a netdev notifier that the master unregisters.
>
> Luckily, there is something we can do, and that is to inform the driver
> core that we have a runtime dependency to the DSA master interface's
> device, and create a device link where that is the supplier and we are
> the consumer. Having this device link will make the DSA switch unbind
> before the DSA master unbinds, which is enough to avoid the WARN_ON from
> rollback_registered_many.
>
> Note that even before the blamed commit, DSA did nothing intelligent
> when the master interface got unregistered either. See the discussion
> here:
> https://lore.kernel.org/netdev/20200505210253.20311-1-f.fainelli@gmail.com/
> But this time, at least the WARN_ON is loud enough that the
> upper_dev_link commit can be blamed.
>
> The advantage with this approach vs dev_hold(master) in the attached
> link is that the latter is not meant for long term reference counting.
> With dev_hold, the only thing that will happen is that when the user
> attempts an unbind of the DSA master, netdev_wait_allrefs will keep
> waiting and waiting, due to DSA keeping the refcount forever. DSA would
> not access freed memory corresponding to the master interface, but the
> unbind would still result in a freeze. Whereas with device links,
> graceful teardown is ensured. It even works with cascaded DSA trees.
>
> $ echo 0000:00:00.2 > /sys/bus/pci/drivers/fsl_enetc/unbind
> [ 1818.797546] device swp0 left promiscuous mode
> [ 1819.301112] sja1105 spi2.0: Link is Down
> [ 1819.307981] DSA: tree 1 torn down
> [ 1819.312408] device eno2 left promiscuous mode
> [ 1819.656803] mscc_felix 0000:00:00.5: Link is Down
> [ 1819.667194] DSA: tree 0 torn down
> [ 1819.711557] fsl_enetc 0000:00:00.2 eno2: Link is Down
>
> This approach allows us to keep the DSA framework absolutely unchanged,
> and the driver core will just know to unbind us first when the master
> goes away - as opposed to the large (and probably impossible) rework
> required if attempting to listen for NETDEV_UNREGISTER.
>
> As per the documentation at Documentation/driver-api/device_link.rst,
> specifying the DL_FLAG_AUTOREMOVE_CONSUMER flag causes the device link
> to be automatically purged when the consumer fails to probe or later
> unbinds. So we don't need to keep the consumer_link variable in struct
> dsa_switch.
>
> Fixes: 2f1e8ea726e9 ("net: dsa: link interfaces with the DSA master to get rid of lockdep warnings")
> Signed-off-by: Vladimir Oltean <vladimir.oltean@....com>
Reviewed-by: Florian Fainelli <f.fainelli@...il.com>
Tested-by: Florian Fainelli <f.fainelli@...il.com>
Thanks!
--
Florian
Powered by blists - more mailing lists