Message-ID: <Z1vfsAyuxcohT7th@fedora>
Date: Fri, 13 Dec 2024 07:18:08 +0000
From: Hangbin Liu <liuhangbin@...il.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: netdev@...r.kernel.org, Jay Vosburgh <jv@...sburgh.net>,
Andy Gospodarek <andy@...yhouse.net>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
Nikolay Aleksandrov <razor@...ckwall.org>,
Simon Horman <horms@...nel.org>, Jianbo Liu <jianbol@...dia.com>,
Tariq Toukan <tariqt@...dia.com>,
Andrew Lunn <andrew+netdev@...n.ch>, Shuah Khan <shuah@...nel.org>,
linux-kselftest@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net 0/2] bond: fix xfrm offload feature during init
On Thu, Dec 12, 2024 at 06:27:34AM -0800, Jakub Kicinski wrote:
> On Wed, 11 Dec 2024 07:11:25 +0000 Hangbin Liu wrote:
> > The first patch fixes the xfrm offload feature when setting up
> > active-backup mode. The second patch adds an ipsec offload test.
>
> Looks like the test is too good, is there a fix pending somewhere for
> the BUG below? We can't merge the test before that:
This should be a regression from 2aeeef906d5a ("bonding: change ipsec_lock
from spin lock to mutex"): xfrm_state_delete() takes spin_lock_bh(&x->lock)
around the state delete, so bond_ipsec_del_sa() is now called with that spin
lock held and sleeps when it takes the ipsec_lock mutex.
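For reference, here is a rough sketch of the call path I believe triggers
the splat below (pieced together from the trace, not literal kernel code):

    xfrm_state_flush() / xfrm_state_delete()
        spin_lock_bh(&x->lock);                 /* BH spin lock held */
        __xfrm_state_delete(x)
            xfrm_dev_state_delete(x)
                /* xdo_dev_state_delete for bond0 is bond_ipsec_del_sa */
                bond_ipsec_del_sa(x)
                    mutex_lock(&bond->ipsec_lock);  /* may sleep -> BUG */
        spin_unlock_bh(&x->lock);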
But I'm not sure whether it is proper to release the xfrm state's spin lock
from the bond code; that seems like too specific a workaround.
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 7daeab67e7b5..69563bc958ca 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -592,6 +592,7 @@ static void bond_ipsec_del_sa(struct xfrm_state *xs)
 	real_dev->xfrmdev_ops->xdo_dev_state_delete(xs);
 out:
 	netdev_put(real_dev, &tracker);
+	spin_unlock_bh(&xs->lock);
 	mutex_lock(&bond->ipsec_lock);
 	list_for_each_entry(ipsec, &bond->ipsec_list, list) {
 		if (ipsec->xs == xs) {
@@ -601,6 +602,7 @@ static void bond_ipsec_del_sa(struct xfrm_state *xs)
 		}
 	}
 	mutex_unlock(&bond->ipsec_lock);
+	spin_lock_bh(&xs->lock);
 }
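The idea is to drop xs->lock around the mutex-protected walk of
bond->ipsec_list and re-take it before returning, so bond_ipsec_del_sa()
no longer sleeps while the state's spin lock is held. The part I'm unsure
about is whether anything relies on xs->lock staying held across the whole
delete path.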
What do you think?
Thanks
Hangbin
>
> https://netdev-3.bots.linux.dev/vmksft-bonding-dbg/results/900082/11-bond-ipsec-offload-sh/stderr
>
> [ 859.672652][ C3] bond_xfrm_update_stats: eth0 doesn't support xdo_dev_state_update_stats
> [ 860.467189][ T8677] bond0: (slave eth0): link status definitely down, disabling slave
> [ 860.467664][ T8677] bond0: (slave eth1): making interface the new active one
> [ 860.831042][ T9677] bond_xfrm_update_stats: eth1 doesn't support xdo_dev_state_update_stats
> [ 862.195271][ T9683] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:562
> [ 862.195880][ T9683] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 9683, name: ip
> [ 862.196189][ T9683] preempt_count: 201, expected: 0
> [ 862.196396][ T9683] RCU nest depth: 0, expected: 0
> [ 862.196591][ T9683] 2 locks held by ip/9683:
> [ 862.196818][ T9683] #0: ffff88800a829558 (&net->xfrm.xfrm_cfg_mutex){+.+.}-{4:4}, at: xfrm_netlink_rcv+0x65/0x90 [xfrm_user]
> [ 862.197264][ T9683] #1: ffff88800f460548 (&x->lock){+.-.}-{3:3}, at: xfrm_state_flush+0x1b3/0x3a0
> [ 862.197629][ T9683] CPU: 3 UID: 0 PID: 9683 Comm: ip Not tainted 6.13.0-rc1-virtme #1
> [ 862.197967][ T9683] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [ 862.198204][ T9683] Call Trace:
> [ 862.198352][ T9683] <TASK>
> [ 862.198458][ T9683] dump_stack_lvl+0xb0/0xd0
> [ 862.198659][ T9683] __might_resched+0x2f8/0x530
> [ 862.198852][ T9683] ? kfree+0x2d/0x330
> [ 862.199005][ T9683] __mutex_lock+0xd9/0xbc0
> [ 862.199202][ T9683] ? ref_tracker_free+0x35e/0x910
> [ 862.199401][ T9683] ? bond_ipsec_del_sa+0x2c1/0x790
> [ 862.199937][ T9683] ? find_held_lock+0x2c/0x110
> [ 862.200133][ T9683] ? __pfx___mutex_lock+0x10/0x10
> [ 862.200329][ T9683] ? bond_ipsec_del_sa+0x280/0x790
> [ 862.200519][ T9683] ? xfrm_dev_state_delete+0x97/0x170
> [ 862.200711][ T9683] ? __xfrm_state_delete+0x681/0x8e0
> [ 862.200907][ T9683] ? xfrm_user_rcv_msg+0x4f8/0x920 [xfrm_user]
> [ 862.201151][ T9683] ? netlink_rcv_skb+0x130/0x360
> [ 862.201347][ T9683] ? xfrm_netlink_rcv+0x74/0x90 [xfrm_user]
> [ 862.201587][ T9683] ? netlink_unicast+0x44b/0x710
> [ 862.201780][ T9683] ? netlink_sendmsg+0x723/0xbe0
> [ 862.201973][ T9683] ? ____sys_sendmsg+0x7ac/0xa10
> [ 862.202164][ T9683] ? ___sys_sendmsg+0xee/0x170
> [ 862.202355][ T9683] ? __sys_sendmsg+0x109/0x1a0
> [ 862.202546][ T9683] ? do_syscall_64+0xc1/0x1d0
> [ 862.202738][ T9683] ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
> [ 862.202986][ T9683] ? __pfx_nsim_ipsec_del_sa+0x10/0x10 [netdevsim]
> [ 862.203251][ T9683] ? bond_ipsec_del_sa+0x2c1/0x790
> [ 862.203457][ T9683] bond_ipsec_del_sa+0x2c1/0x790
> [ 862.203648][ T9683] ? __pfx_lock_acquire.part.0+0x10/0x10
> [ 862.203845][ T9683] ? __pfx_bond_ipsec_del_sa+0x10/0x10
> [ 862.204034][ T9683] ? do_raw_spin_lock+0x131/0x270
> [ 862.204225][ T9683] ? __pfx_do_raw_spin_lock+0x10/0x10
> [ 862.204468][ T9683] xfrm_dev_state_delete+0x97/0x170
> [ 862.204665][ T9683] __xfrm_state_delete+0x681/0x8e0
> [ 862.204858][ T9683] xfrm_state_flush+0x1bb/0x3a0
> [ 862.205057][ T9683] xfrm_flush_sa+0xf0/0x270 [xfrm_user]
> [ 862.205290][ T9683] ? __pfx_xfrm_flush_sa+0x10/0x10 [xfrm_user]
> [ 862.205537][ T9683] ? __nla_validate_parse+0x48/0x3d0
> [ 862.205744][ T9683] xfrm_user_rcv_msg+0x4f8/0x920 [xfrm_user]
> [ 862.205985][ T9683] ? __pfx___lock_release+0x10/0x10
> [ 862.206174][ T9683] ? __pfx_xfrm_user_rcv_msg+0x10/0x10 [xfrm_user]
> [ 862.206412][ T9683] ? __pfx_validate_chain+0x10/0x10
> [ 862.206614][ T9683] ? hlock_class+0x4e/0x130
> [ 862.206807][ T9683] ? mark_lock+0x38/0x3e0
> [ 862.206986][ T9683] ? __mutex_trylock_common+0xfa/0x260
> [ 862.207181][ T9683] ? __pfx___mutex_trylock_common+0x10/0x10
> [ 862.207425][ T9683] netlink_rcv_skb+0x130/0x360