[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c1bf4de54e829111e0e4a70e7bd1cf523c9550ff.camel@siemens.com>
Date: Wed, 4 Sep 2024 08:03:09 +0000
From: "Sverdlin, Alexander" <alexander.sverdlin@...mens.com>
To: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"vladimir.oltean@....com" <vladimir.oltean@....com>
CC: "andrew@...n.ch" <andrew@...n.ch>, "olteanv@...il.com"
<olteanv@...il.com>, "daniel.klauer@....de" <daniel.klauer@....de>,
"davem@...emloft.net" <davem@...emloft.net>, "vivien.didelot@...il.com"
<vivien.didelot@...il.com>, "LinoSanfilippo@....de" <LinoSanfilippo@....de>,
"f.fainelli@...il.com" <f.fainelli@...il.com>, "kuba@...nel.org"
<kuba@...nel.org>, "rafael.richter@....de" <rafael.richter@....de>
Subject: Re: [PATCH net] net: dsa: fix panic when DSA master device unbinds on
shutdown
Hello Vladimir!
On Wed, 2022-02-09 at 14:04 +0200, Vladimir Oltean wrote:
> Rafael reports that on a system with LX2160A and Marvell DSA switches,
> if a reboot occurs while the DSA master (dpaa2-eth) is up, the following
> panic can be seen:
>
> systemd-shutdown[1]: Rebooting.
> Unable to handle kernel paging request at virtual address 00a0000800000041
> [00a0000800000041] address between user and kernel address ranges
> Internal error: Oops: 96000004 [#1] PREEMPT SMP
> CPU: 6 PID: 1 Comm: systemd-shutdow Not tainted 5.16.5-00042-g8f5585009b24 #32
> pc : dsa_slave_netdevice_event+0x130/0x3e4
> lr : raw_notifier_call_chain+0x50/0x6c
> Call trace:
> dsa_slave_netdevice_event+0x130/0x3e4
> raw_notifier_call_chain+0x50/0x6c
> call_netdevice_notifiers_info+0x54/0xa0
> __dev_close_many+0x50/0x130
> dev_close_many+0x84/0x120
> unregister_netdevice_many+0x130/0x710
> unregister_netdevice_queue+0x8c/0xd0
> unregister_netdev+0x20/0x30
> dpaa2_eth_remove+0x68/0x190
> fsl_mc_driver_remove+0x20/0x5c
> __device_release_driver+0x21c/0x220
> device_release_driver_internal+0xac/0xb0
> device_links_unbind_consumers+0xd4/0x100
> __device_release_driver+0x94/0x220
> device_release_driver+0x28/0x40
> bus_remove_device+0x118/0x124
> device_del+0x174/0x420
> fsl_mc_device_remove+0x24/0x40
> __fsl_mc_device_remove+0xc/0x20
> device_for_each_child+0x58/0xa0
> dprc_remove+0x90/0xb0
> fsl_mc_driver_remove+0x20/0x5c
> __device_release_driver+0x21c/0x220
> device_release_driver+0x28/0x40
> bus_remove_device+0x118/0x124
> device_del+0x174/0x420
> fsl_mc_bus_remove+0x80/0x100
> fsl_mc_bus_shutdown+0xc/0x1c
> platform_shutdown+0x20/0x30
> device_shutdown+0x154/0x330
> __do_sys_reboot+0x1cc/0x250
> __arm64_sys_reboot+0x20/0x30
> invoke_syscall.constprop.0+0x4c/0xe0
> do_el0_svc+0x4c/0x150
> el0_svc+0x24/0xb0
> el0t_64_sync_handler+0xa8/0xb0
> el0t_64_sync+0x178/0x17c
>
> It can be seen from the stack trace that the problem is that the
> deregistration of the master causes a dev_close(), which gets notified
> as NETDEV_GOING_DOWN to dsa_slave_netdevice_event().
> But dsa_switch_shutdown() has already run, and this has unregistered the
> DSA slave interfaces, and yet, the NETDEV_GOING_DOWN handler attempts to
> call dev_close_many() on those slave interfaces, leading to the problem.
>
> The previous attempt to avoid the NETDEV_GOING_DOWN on the master after
> dsa_switch_shutdown() was called seems improper. Unregistering the slave
> interfaces is unnecessary and unhelpful. Instead, after the slaves have
> stopped being uppers of the DSA master, we can now reset to NULL the
> master->dsa_ptr pointer, which will make DSA start ignoring all future
> notifier events on the master.
>
> Fixes: 0650bf52b31f ("net: dsa: be compatible with masters which unregister on shutdown")
> Reported-by: Rafael Richter <rafael.richter@....de>
> Signed-off-by: Vladimir Oltean <vladimir.oltean@....com>
> ---
> net/dsa/dsa2.c | 25 ++++++-------------------
> 1 file changed, 6 insertions(+), 19 deletions(-)
>
> diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
> index 909b045c9b11..e498c927c3d0 100644
> --- a/net/dsa/dsa2.c
> +++ b/net/dsa/dsa2.c
> @@ -1784,7 +1784,6 @@ EXPORT_SYMBOL_GPL(dsa_unregister_switch);
> void dsa_switch_shutdown(struct dsa_switch *ds)
> {
> struct net_device *master, *slave_dev;
> - LIST_HEAD(unregister_list);
> struct dsa_port *dp;
>
> mutex_lock(&dsa2_mutex);
> @@ -1795,25 +1794,13 @@ void dsa_switch_shutdown(struct dsa_switch *ds)
> slave_dev = dp->slave;
>
> netdev_upper_dev_unlink(master, slave_dev);
> - /* Just unlinking ourselves as uppers of the master is not
> - * sufficient. When the master net device unregisters, that will
> - * also call dev_close, which we will catch as NETDEV_GOING_DOWN
> - * and trigger a dev_close on our own devices (dsa_slave_close).
> - * In turn, that will call dev_mc_unsync on the master's net
> - * device. If the master is also a DSA switch port, this will
> - * trigger dsa_slave_set_rx_mode which will call dev_mc_sync on
> - * its own master. Lockdep will complain about the fact that
> - * all cascaded masters have the same dsa_master_addr_list_lock_key,
> - * which it normally would not do if the cascaded masters would
> - * be in a proper upper/lower relationship, which we've just
> - * destroyed.
> - * To suppress the lockdep warnings, let's actually unregister
> - * the DSA slave interfaces too, to avoid the nonsensical
> - * multicast address list synchronization on shutdown.
> - */
> - unregister_netdevice_queue(slave_dev, &unregister_list);
> }
> - unregister_netdevice_many(&unregister_list);
> +
> + /* Disconnect from further netdevice notifiers on the master,
> + * since netdev_uses_dsa() will now return false.
> + */
> + dsa_switch_for_each_cpu_port(dp, ds)
> + dp->master->dsa_ptr = NULL;
This is unfortunately racy and leads to other panics:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010
CPU: 0 PID: 12 Comm: ksoftirqd/0 Tainted: G O 6.1.99+gitb7793b7d9b35 #1
pc : lan9303_rcv+0x64/0x210
lr : lan9303_rcv+0x148/0x210
Call trace:
lan9303_rcv+0x64/0x210
dsa_switch_rcv+0x1d8/0x350
__netif_receive_skb_list_core+0x1f8/0x220
netif_receive_skb_list_internal+0x18c/0x2a4
napi_gro_receive+0x238/0x254
fec_enet_rx_napi+0x830/0xe60
__napi_poll+0x40/0x210
net_rx_action+0x138/0x2d0
Even though dsa_switch_rcv() checks
if (unlikely(!cpu_dp)) {
kfree_skb(skb);
return 0;
}
if dsa_switch_shutdown() happens to zero dsa_ptr before
dsa_conduit_find_user(dev, 0, port) call, the latter will dereference dsa_ptr==NULL:
static inline struct net_device *dsa_conduit_find_user(struct net_device *dev,
int device, int port)
{
struct dsa_port *cpu_dp = dev->dsa_ptr;
struct dsa_switch_tree *dst = cpu_dp->dst;
I believe there are other race patterns as well if we consider all possible
static int dsa_switch_rcv(struct sk_buff *skb, struct net_device *dev,
struct packet_type *pt, struct net_device *unused)
{
struct metadata_dst *md_dst = skb_metadata_dst(skb);
struct dsa_port *cpu_dp = dev->dsa_ptr;
...
nskb = cpu_dp->rcv(skb, dev);
>
> rtnl_unlock();
> mutex_unlock(&dsa2_mutex);
I'm not sure there is a safe way to zero dsa_ptr without ensuring the port
is down and there is no ongoing receive in parallel.
--
Alexander Sverdlin
Siemens AG
www.siemens.com
Powered by blists - more mailing lists