[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ebfe43c6-a576-ba33-11e0-a2d143b56de3@gmail.com>
Date: Mon, 7 Sep 2020 20:13:00 -0700
From: Florian Fainelli <f.fainelli@...il.com>
To: Vladimir Oltean <olteanv@...il.com>, kuba@...nel.org
Cc: vivien.didelot@...il.com, andrew@...n.ch, xiyou.wangcong@...il.com,
ap420073@...il.com, netdev@...r.kernel.org
Subject: Re: [PATCH v2 net] net: dsa: link interfaces with the DSA master to
get rid of lockdep warnings
On 9/7/2020 4:48 PM, Vladimir Oltean wrote:
> Since commit 845e0ebb4408 ("net: change addr_list_lock back to static
> key"), cascaded DSA setups (DSA switch port as DSA master for another
> DSA switch port) are emitting this lockdep warning:
>
> ============================================
> WARNING: possible recursive locking detected
> 5.8.0-rc1-00133-g923e4b5032dd-dirty #208 Not tainted
> --------------------------------------------
> dhcpcd/323 is trying to acquire lock:
> ffff000066dd4268 (&dsa_master_addr_list_lock_key/1){+...}-{2:2}, at: dev_mc_sync+0x44/0x90
>
> but task is already holding lock:
> ffff00006608c268 (&dsa_master_addr_list_lock_key/1){+...}-{2:2}, at: dev_mc_sync+0x44/0x90
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
>
> CPU0
> ----
> lock(&dsa_master_addr_list_lock_key/1);
> lock(&dsa_master_addr_list_lock_key/1);
>
> *** DEADLOCK ***
>
> May be due to missing lock nesting notation
>
> 3 locks held by dhcpcd/323:
> #0: ffffdbd1381dda18 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x24/0x30
> #1: ffff00006614b268 (_xmit_ETHER){+...}-{2:2}, at: dev_set_rx_mode+0x28/0x48
> #2: ffff00006608c268 (&dsa_master_addr_list_lock_key/1){+...}-{2:2}, at: dev_mc_sync+0x44/0x90
>
> stack backtrace:
> Call trace:
> dump_backtrace+0x0/0x1e0
> show_stack+0x20/0x30
> dump_stack+0xec/0x158
> __lock_acquire+0xca0/0x2398
> lock_acquire+0xe8/0x440
> _raw_spin_lock_nested+0x64/0x90
> dev_mc_sync+0x44/0x90
> dsa_slave_set_rx_mode+0x34/0x50
> __dev_set_rx_mode+0x60/0xa0
> dev_mc_sync+0x84/0x90
> dsa_slave_set_rx_mode+0x34/0x50
> __dev_set_rx_mode+0x60/0xa0
> dev_set_rx_mode+0x30/0x48
> __dev_open+0x10c/0x180
> __dev_change_flags+0x170/0x1c8
> dev_change_flags+0x2c/0x70
> devinet_ioctl+0x774/0x878
> inet_ioctl+0x348/0x3b0
> sock_do_ioctl+0x50/0x310
> sock_ioctl+0x1f8/0x580
> ksys_ioctl+0xb0/0xf0
> __arm64_sys_ioctl+0x28/0x38
> el0_svc_common.constprop.0+0x7c/0x180
> do_el0_svc+0x2c/0x98
> el0_sync_handler+0x9c/0x1b8
> el0_sync+0x158/0x180
>
> Since DSA never made use of the netdev API for describing links between
> upper devices and lower devices, the dev->lower_level value of a DSA
> switch interface would be 1, which would warn when it is a DSA master.
>
> We can use netdev_upper_dev_link() to describe the relationship between
> a DSA slave and a DSA master. To be precise, a DSA "slave" (switch port)
> is an "upper" to a DSA "master" (host port). The relationship is "many
> uppers to one lower", like in the case of VLAN. So, for that reason, we
> use the same function as VLAN uses.
>
> There might be a chance that somebody will try to take hold of this
> interface and use it immediately after register_netdev() and before
> netdev_upper_dev_link(). To avoid that, we do the registration and
> linkage while holding the RTNL, and we use the RTNL-locked cousin of
> register_netdev(), which is register_netdevice().
>
> Since this warning was not there when lockdep was using dynamic keys for
> addr_list_lock, we are blaming the lockdep patch itself. The network
> stack _has_ been using static lockdep keys before, and it _is_ likely
> that stacked DSA setups have been triggering these lockdep warnings
> since forever, however I can't test very old kernels on this particular
> stacked DSA setup, to ensure I'm not in fact introducing regressions.
>
> Fixes: 845e0ebb4408 ("net: change addr_list_lock back to static key")
> Suggested-by: Cong Wang <xiyou.wangcong@...il.com>
> Signed-off-by: Vladimir Oltean <olteanv@...il.com>
I was worried for a second that dsa_slave_notify(slave_dev,
DSA_PORT_UNREGISTER) would no longer work after your changes, but we are
only providing references to both slave_dev and master and this actually
makes the unregister symmetrical with the register whereby the slave_dev
reference is not in a registered state for both events.
Reviewed-by: Florian Fainelli <f.fainelli@...il.com>
--
Florian
Powered by blists - more mailing lists