Message-ID: <dfbf2c60812423a3bcccc06921efdd987a003d36.camel@nvidia.com>
Date: Thu, 8 May 2025 10:33:48 +0000
From: Cosmin Ratiu <cratiu@...dia.com>
To: "stfomichev@...il.com" <stfomichev@...il.com>
CC: "davem@...emloft.net" <davem@...emloft.net>,
"linux-kselftest@...r.kernel.org" <linux-kselftest@...r.kernel.org>, Dragos
Tatulea <dtatulea@...dia.com>, "sdf@...ichev.me" <sdf@...ichev.me>,
"kuba@...nel.org" <kuba@...nel.org>, "jiri@...nulli.us" <jiri@...nulli.us>,
"edumazet@...gle.com" <edumazet@...gle.com>, "pabeni@...hat.com"
<pabeni@...hat.com>, Saeed Mahameed <saeedm@...dia.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [PATCH net] net: Lock lower level devices when updating features
On Wed, 2025-05-07 at 14:20 -0700, Stanislav Fomichev wrote:
> On 05/07, Cosmin Ratiu wrote:
> > On Wed, 2025-05-07 at 15:13 +0000, Cosmin Ratiu wrote:
> > > > In any case, please hold off with picking this patch up, it
> > > > seems
> > > > there's a possibility of a real deadlock. Here's the scenario:
>
> Hmm, are you sure you're calling __netdev_update_features on the
> upper? I don't see how the lower would be locked in that case. From
> my POV, this is what happens:
>
> 1. your dev (lower) has a vlan on it (upper)
> 2. you call lro=off on the _lower_
> 3. this triggers FEAT_CHANGE notifier and vlan_device_event catches it
> 4. since the lower has a vlan device (dev->vlan_info != NULL), it goes
>    over every other vlan in the group and triggers
>    netdev_update_features for the upper (netdev_update_features vlandev)
> 5. the upper tries to sync the features into the lower (including the
>    one that triggered FEAT_CHANGE) and that's where the deadlock
>    happens
>
> But I think (5) should be largely a no-op for the device triggering
> the notification, because the features have already been applied in
> ethnl_set_features.
> I'd move the lock into netdev_sync_lower_features, and only after
> checking the features (and making sure that we are going to change
> them). The feature check might be racy, but I think it should still
> work?
>
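(For anyone following along, the path in steps 3-5 above roughly looks
like the sketch below. It is trimmed and paraphrased from
vlan_device_event() in net/8021q/vlan.c, so treat it as an illustration
of the call chain rather than the exact upstream code.)

/* Trimmed illustration of steps 3-5 of the scenario above. */
static int vlan_device_event(struct notifier_block *unused,
			     unsigned long event, void *ptr)
{
	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
	struct net_device *vlandev;
	struct vlan_info *vlan_info;
	struct vlan_group *grp;
	int i;

	vlan_info = rtnl_dereference(dev->vlan_info);
	if (!vlan_info)
		return NOTIFY_DONE;
	grp = &vlan_info->grp;

	switch (event) {
	case NETDEV_FEAT_CHANGE:
		/* Step 4: propagate the lower dev's new features to every
		 * vlan upper in the group.
		 */
		vlan_group_for_each_dev(grp, i, vlandev)
			netdev_update_features(vlandev);
		/* Step 5: each upper's __netdev_update_features() calls
		 * netdev_sync_lower_features(), which reaches back down into
		 * the lower dev -- the very device that raised FEAT_CHANGE
		 * and still holds its instance lock.
		 */
		break;
	}
	return NOTIFY_DONE;
}
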
You are right. If I restrict the lower dev critical section to only the
call to __netdev_update_features for the lower dev, there's no deadlock
anymore, because the device with the lock held has already had its
features updated.
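Roughly, that amounts to something like the sketch below (based on
netdev_sync_lower_features() in net/core/dev.c, and assuming
netdev_lock_ops()/netdev_unlock_ops() are the right instance-lock
helpers to use here; the actual patch may end up looking different):

static void netdev_sync_lower_features(struct net_device *upper,
	struct net_device *lower, netdev_features_t features)
{
	netdev_features_t upper_disables = NETIF_F_UPPER_DISABLES;
	netdev_features_t feature;
	int feature_bit;

	for_each_netdev_feature(upper_disables, feature_bit) {
		feature = __NETIF_F_BIT(feature_bit);
		if (!(features & feature) && (lower->features & feature)) {
			netdev_dbg(upper, "Disabling feature %pNF on lower dev %s.\n",
				   &feature, lower->name);
			lower->wanted_features &= ~feature;
			/* Take the lower's lock only here, once we know the
			 * feature actually has to change. The device that
			 * raised FEAT_CHANGE already has its features applied
			 * (in ethnl_set_features), so we never get here for
			 * it and never try to re-take the lock it holds.
			 */
			netdev_lock_ops(lower);
			__netdev_update_features(lower);
			if (unlikely(lower->features & feature))
				netdev_WARN(upper, "failed to disable %pNF on %s!\n",
					    &feature, lower->name);
			else
				netdev_features_change(lower);
			netdev_unlock_ops(lower);
		}
	}
}
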
I will send a new version of this patch once the full regression suite
finishes and I've convinced myself there are no other issues of this
kind that we can run into.
> Can you also share the bonding stacktrace to confirm it's the same
> issue?
Sure, here it is; it's the same scenario. bond_netdev_event gets called
on a slave dev, recomputes the features and updates all slaves
(bond_compute_features), and then the same lock is reacquired.
This is also fixed by your suggestion above.
============================================
WARNING: possible recursive locking detected
devlink/14341 is trying to acquire lock:
ffff88810ebd8c80 (&dev_instance_lock_key#9){+.+.}-{4:4}, at:
__netdev_update_features+0x31e/0xe20
but task is already holding lock:
ffff88810ebd8c80 (&dev_instance_lock_key#9){+.+.}-{4:4}, at:
mlx5e_attach_netdev+0x31f/0x360 [mlx5_core]
and the lock comparison function returns 0:
other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&dev_instance_lock_key#9);
  lock(&dev_instance_lock_key#9);

 *** DEADLOCK ***

 May be due to missing lock nesting notation
4 locks held by devlink/14341:
#0: ffffffff830e5a50 (cb_lock){++++}-{4:4}, at: genl_rcv+0x15/0x40
#1: ffff888164a5c250 (&devlink->lock_key){+.+.}-{4:4}, at:
devlink_get_from_attrs_lock+0xbc/0x180
#2: ffffffff830cf708 (rtnl_mutex){+.+.}-{4:4}, at:
mlx5e_attach_netdev+0x30d/0x360 [mlx5_core]
#3: ffff88810ebd8c80 (&dev_instance_lock_key#9){+.+.}-{4:4}, at:
mlx5e_attach_netdev+0x31f/0x360 [mlx5_core]
Call Trace:
<TASK>
dump_stack_lvl+0x69/0xa0
print_deadlock_bug.cold+0xbd/0xca
__lock_acquire+0x163c/0x2f00
lock_acquire+0xd3/0x2e0
__mutex_lock+0x98/0xf10
__netdev_update_features+0x31e/0xe20
netdev_change_features+0x1f/0x60
bond_compute_features+0x24e/0x300 [bonding]
bond_netdev_event+0x2e0/0x400 [bonding]
notifier_call_chain+0x3d/0x100
netdev_update_features+0x52/0x60
mlx5e_attach_netdev+0x32f/0x360 [mlx5_core]
mlx5e_netdev_attach_profile+0x48/0x90 [mlx5_core]
mlx5e_netdev_change_profile+0x90/0xf0 [mlx5_core]
mlx5e_vport_rep_load+0x414/0x490 [mlx5_core]
__esw_offloads_load_rep+0x87/0xd0 [mlx5_core]
mlx5_esw_offloads_rep_load+0x45/0xe0 [mlx5_core]
esw_offloads_enable+0xb7b/0xca0 [mlx5_core]
mlx5_eswitch_enable_locked+0x293/0x430 [mlx5_core]
mlx5_devlink_eswitch_mode_set+0x229/0x620 [mlx5_core]
devlink_nl_eswitch_set_doit+0x60/0xd0
genl_family_rcv_msg_doit+0xe0/0x130
genl_rcv_msg+0x188/0x290
netlink_rcv_skb+0x4b/0xf0
genl_rcv+0x24/0x40
netlink_unicast+0x1e1/0x2c0
netlink_sendmsg+0x210/0x450
Cosmin.