netdev - Re: [PATCH net] net: dsa: fix panic when DSA master device unbinds on shutdown

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c1bf4de54e829111e0e4a70e7bd1cf523c9550ff.camel@siemens.com>
Date: Wed, 4 Sep 2024 08:03:09 +0000
From: "Sverdlin, Alexander" <alexander.sverdlin@...mens.com>
To: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"vladimir.oltean@....com" <vladimir.oltean@....com>
CC: "andrew@...n.ch" <andrew@...n.ch>, "olteanv@...il.com"
	<olteanv@...il.com>, "daniel.klauer@....de" <daniel.klauer@....de>,
	"davem@...emloft.net" <davem@...emloft.net>, "vivien.didelot@...il.com"
	<vivien.didelot@...il.com>, "LinoSanfilippo@....de" <LinoSanfilippo@....de>,
	"f.fainelli@...il.com" <f.fainelli@...il.com>, "kuba@...nel.org"
	<kuba@...nel.org>, "rafael.richter@....de" <rafael.richter@....de>
Subject: Re: [PATCH net] net: dsa: fix panic when DSA master device unbinds on
 shutdown

Hello Vladimir!

On Wed, 2022-02-09 at 14:04 +0200, Vladimir Oltean wrote:
> Rafael reports that on a system with LX2160A and Marvell DSA switches,
> if a reboot occurs while the DSA master (dpaa2-eth) is up, the following
> panic can be seen:
> 
> systemd-shutdown[1]: Rebooting.
> Unable to handle kernel paging request at virtual address 00a0000800000041
> [00a0000800000041] address between user and kernel address ranges
> Internal error: Oops: 96000004 [#1] PREEMPT SMP
> CPU: 6 PID: 1 Comm: systemd-shutdow Not tainted 5.16.5-00042-g8f5585009b24 #32
> pc : dsa_slave_netdevice_event+0x130/0x3e4
> lr : raw_notifier_call_chain+0x50/0x6c
> Call trace:
>  dsa_slave_netdevice_event+0x130/0x3e4
>  raw_notifier_call_chain+0x50/0x6c
>  call_netdevice_notifiers_info+0x54/0xa0
>  __dev_close_many+0x50/0x130
>  dev_close_many+0x84/0x120
>  unregister_netdevice_many+0x130/0x710
>  unregister_netdevice_queue+0x8c/0xd0
>  unregister_netdev+0x20/0x30
>  dpaa2_eth_remove+0x68/0x190
>  fsl_mc_driver_remove+0x20/0x5c
>  __device_release_driver+0x21c/0x220
>  device_release_driver_internal+0xac/0xb0
>  device_links_unbind_consumers+0xd4/0x100
>  __device_release_driver+0x94/0x220
>  device_release_driver+0x28/0x40
>  bus_remove_device+0x118/0x124
>  device_del+0x174/0x420
>  fsl_mc_device_remove+0x24/0x40
>  __fsl_mc_device_remove+0xc/0x20
>  device_for_each_child+0x58/0xa0
>  dprc_remove+0x90/0xb0
>  fsl_mc_driver_remove+0x20/0x5c
>  __device_release_driver+0x21c/0x220
>  device_release_driver+0x28/0x40
>  bus_remove_device+0x118/0x124
>  device_del+0x174/0x420
>  fsl_mc_bus_remove+0x80/0x100
>  fsl_mc_bus_shutdown+0xc/0x1c
>  platform_shutdown+0x20/0x30
>  device_shutdown+0x154/0x330
>  __do_sys_reboot+0x1cc/0x250
>  __arm64_sys_reboot+0x20/0x30
>  invoke_syscall.constprop.0+0x4c/0xe0
>  do_el0_svc+0x4c/0x150
>  el0_svc+0x24/0xb0
>  el0t_64_sync_handler+0xa8/0xb0
>  el0t_64_sync+0x178/0x17c
> 
> It can be seen from the stack trace that the problem is that the
> deregistration of the master causes a dev_close(), which gets notified
> as NETDEV_GOING_DOWN to dsa_slave_netdevice_event().
> But dsa_switch_shutdown() has already run, and this has unregistered the
> DSA slave interfaces, and yet, the NETDEV_GOING_DOWN handler attempts to
> call dev_close_many() on those slave interfaces, leading to the problem.
> 
> The previous attempt to avoid the NETDEV_GOING_DOWN on the master after
> dsa_switch_shutdown() was called seems improper. Unregistering the slave
> interfaces is unnecessary and unhelpful. Instead, after the slaves have
> stopped being uppers of the DSA master, we can now reset to NULL the
> master->dsa_ptr pointer, which will make DSA start ignoring all future
> notifier events on the master.
> 
> Fixes: 0650bf52b31f ("net: dsa: be compatible with masters which unregister on shutdown")
> Reported-by: Rafael Richter <rafael.richter@....de>
> Signed-off-by: Vladimir Oltean <vladimir.oltean@....com>
> ---
>  net/dsa/dsa2.c | 25 ++++++-------------------
>  1 file changed, 6 insertions(+), 19 deletions(-)
> 
> diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
> index 909b045c9b11..e498c927c3d0 100644
> --- a/net/dsa/dsa2.c
> +++ b/net/dsa/dsa2.c
> @@ -1784,7 +1784,6 @@ EXPORT_SYMBOL_GPL(dsa_unregister_switch);
>  void dsa_switch_shutdown(struct dsa_switch *ds)
>  {
>  	struct net_device *master, *slave_dev;
> -	LIST_HEAD(unregister_list);
>  	struct dsa_port *dp;
>  
>  	mutex_lock(&dsa2_mutex);
> @@ -1795,25 +1794,13 @@ void dsa_switch_shutdown(struct dsa_switch *ds)
>  		slave_dev = dp->slave;
>  
>  		netdev_upper_dev_unlink(master, slave_dev);
> -		/* Just unlinking ourselves as uppers of the master is not
> -		 * sufficient. When the master net device unregisters, that will
> -		 * also call dev_close, which we will catch as NETDEV_GOING_DOWN
> -		 * and trigger a dev_close on our own devices (dsa_slave_close).
> -		 * In turn, that will call dev_mc_unsync on the master's net
> -		 * device. If the master is also a DSA switch port, this will
> -		 * trigger dsa_slave_set_rx_mode which will call dev_mc_sync on
> -		 * its own master. Lockdep will complain about the fact that
> -		 * all cascaded masters have the same dsa_master_addr_list_lock_key,
> -		 * which it normally would not do if the cascaded masters would
> -		 * be in a proper upper/lower relationship, which we've just
> -		 * destroyed.
> -		 * To suppress the lockdep warnings, let's actually unregister
> -		 * the DSA slave interfaces too, to avoid the nonsensical
> -		 * multicast address list synchronization on shutdown.
> -		 */
> -		unregister_netdevice_queue(slave_dev, &unregister_list);
>  	}
> -	unregister_netdevice_many(&unregister_list);
> +
> +	/* Disconnect from further netdevice notifiers on the master,
> +	 * since netdev_uses_dsa() will now return false.
> +	 */
> +	dsa_switch_for_each_cpu_port(dp, ds)
> +		dp->master->dsa_ptr = NULL;

This is unfortunately racy and leads to other panics:

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010
CPU: 0 PID: 12 Comm: ksoftirqd/0 Tainted: G           O       6.1.99+gitb7793b7d9b35 #1
pc : lan9303_rcv+0x64/0x210
lr : lan9303_rcv+0x148/0x210
Call trace:
 lan9303_rcv+0x64/0x210
 dsa_switch_rcv+0x1d8/0x350
 __netif_receive_skb_list_core+0x1f8/0x220
 netif_receive_skb_list_internal+0x18c/0x2a4
 napi_gro_receive+0x238/0x254
 fec_enet_rx_napi+0x830/0xe60
 __napi_poll+0x40/0x210
 net_rx_action+0x138/0x2d0

Even though dsa_switch_rcv() checks 

        if (unlikely(!cpu_dp)) {
                kfree_skb(skb);
                return 0;
        }

if dsa_switch_shutdown() happens to zero dsa_ptr before
dsa_conduit_find_user(dev, 0, port) call, the latter will dereference dsa_ptr==NULL:

static inline struct net_device *dsa_conduit_find_user(struct net_device *dev,
                                                       int device, int port)
{
        struct dsa_port *cpu_dp = dev->dsa_ptr;
        struct dsa_switch_tree *dst = cpu_dp->dst;

I believe there are other race patterns as well if we consider all possible

static int dsa_switch_rcv(struct sk_buff *skb, struct net_device *dev,
                          struct packet_type *pt, struct net_device *unused)
{
        struct metadata_dst *md_dst = skb_metadata_dst(skb);
        struct dsa_port *cpu_dp = dev->dsa_ptr;

...

                nskb = cpu_dp->rcv(skb, dev);

>  
>  	rtnl_unlock();
>  	mutex_unlock(&dsa2_mutex);

I'm not sure there is a safe way to zero dsa_ptr without ensuring the port
is down and there is no ongoing receive in parallel.

-- 
Alexander Sverdlin
Siemens AG
www.siemens.com