Message-ID: <Z1vfsAyuxcohT7th@fedora>
Date: Fri, 13 Dec 2024 07:18:08 +0000
From: Hangbin Liu <liuhangbin@...il.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: netdev@...r.kernel.org, Jay Vosburgh <jv@...sburgh.net>,
	Andy Gospodarek <andy@...yhouse.net>,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
	Nikolay Aleksandrov <razor@...ckwall.org>,
	Simon Horman <horms@...nel.org>, Jianbo Liu <jianbol@...dia.com>,
	Tariq Toukan <tariqt@...dia.com>,
	Andrew Lunn <andrew+netdev@...n.ch>, Shuah Khan <shuah@...nel.org>,
	linux-kselftest@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net 0/2] bond: fix xfrm offload feature during init

On Thu, Dec 12, 2024 at 06:27:34AM -0800, Jakub Kicinski wrote:
> On Wed, 11 Dec 2024 07:11:25 +0000 Hangbin Liu wrote:
> > The first patch fixes the xfrm offload feature during setup active-backup
> > mode. The second patch add a ipsec offload testing.
> 
> Looks like the test is too good, is there a fix pending somewhere for
> the BUG below? We can't merge the test before that:

This looks like a regression from 2aeeef906d5a ("bonding: change ipsec_lock from
spin lock to mutex"). xfrm_state_delete() takes spin_lock_bh(&x->lock) around
the state delete, so bond_ipsec_del_sa() now calls mutex_lock() in atomic context.

A change like the one below avoids the splat, but I'm not sure it's proper to
release the xfrm state spin lock in the bond code. That seems too specific.

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 7daeab67e7b5..69563bc958ca 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -592,6 +592,7 @@ static void bond_ipsec_del_sa(struct xfrm_state *xs)
 	real_dev->xfrmdev_ops->xdo_dev_state_delete(xs);
 out:
 	netdev_put(real_dev, &tracker);
+	spin_unlock_bh(&xs->lock);
 	mutex_lock(&bond->ipsec_lock);
 	list_for_each_entry(ipsec, &bond->ipsec_list, list) {
 		if (ipsec->xs == xs) {
@@ -601,6 +602,7 @@ static void bond_ipsec_del_sa(struct xfrm_state *xs)
 		}
 	}
 	mutex_unlock(&bond->ipsec_lock);
+	spin_lock_bh(&xs->lock);
 }
 

What do you think?

Thanks
Hangbin
> 
> https://netdev-3.bots.linux.dev/vmksft-bonding-dbg/results/900082/11-bond-ipsec-offload-sh/stderr
> 
> [  859.672652][    C3] bond_xfrm_update_stats: eth0 doesn't support xdo_dev_state_update_stats
> [  860.467189][ T8677] bond0: (slave eth0): link status definitely down, disabling slave
> [  860.467664][ T8677] bond0: (slave eth1): making interface the new active one
> [  860.831042][ T9677] bond_xfrm_update_stats: eth1 doesn't support xdo_dev_state_update_stats
> [  862.195271][ T9683] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:562
> [  862.195880][ T9683] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 9683, name: ip
> [  862.196189][ T9683] preempt_count: 201, expected: 0
> [  862.196396][ T9683] RCU nest depth: 0, expected: 0
> [  862.196591][ T9683] 2 locks held by ip/9683:
> [  862.196818][ T9683]  #0: ffff88800a829558 (&net->xfrm.xfrm_cfg_mutex){+.+.}-{4:4}, at: xfrm_netlink_rcv+0x65/0x90 [xfrm_user]
> [  862.197264][ T9683]  #1: ffff88800f460548 (&x->lock){+.-.}-{3:3}, at: xfrm_state_flush+0x1b3/0x3a0
> [  862.197629][ T9683] CPU: 3 UID: 0 PID: 9683 Comm: ip Not tainted 6.13.0-rc1-virtme #1
> [  862.197967][ T9683] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [  862.198204][ T9683] Call Trace:
> [  862.198352][ T9683]  <TASK>
> [  862.198458][ T9683]  dump_stack_lvl+0xb0/0xd0
> [  862.198659][ T9683]  __might_resched+0x2f8/0x530
> [  862.198852][ T9683]  ? kfree+0x2d/0x330
> [  862.199005][ T9683]  __mutex_lock+0xd9/0xbc0
> [  862.199202][ T9683]  ? ref_tracker_free+0x35e/0x910
> [  862.199401][ T9683]  ? bond_ipsec_del_sa+0x2c1/0x790
> [  862.199937][ T9683]  ? find_held_lock+0x2c/0x110
> [  862.200133][ T9683]  ? __pfx___mutex_lock+0x10/0x10
> [  862.200329][ T9683]  ? bond_ipsec_del_sa+0x280/0x790
> [  862.200519][ T9683]  ? xfrm_dev_state_delete+0x97/0x170
> [  862.200711][ T9683]  ? __xfrm_state_delete+0x681/0x8e0
> [  862.200907][ T9683]  ? xfrm_user_rcv_msg+0x4f8/0x920 [xfrm_user]
> [  862.201151][ T9683]  ? netlink_rcv_skb+0x130/0x360
> [  862.201347][ T9683]  ? xfrm_netlink_rcv+0x74/0x90 [xfrm_user]
> [  862.201587][ T9683]  ? netlink_unicast+0x44b/0x710
> [  862.201780][ T9683]  ? netlink_sendmsg+0x723/0xbe0
> [  862.201973][ T9683]  ? ____sys_sendmsg+0x7ac/0xa10
> [  862.202164][ T9683]  ? ___sys_sendmsg+0xee/0x170
> [  862.202355][ T9683]  ? __sys_sendmsg+0x109/0x1a0
> [  862.202546][ T9683]  ? do_syscall_64+0xc1/0x1d0
> [  862.202738][ T9683]  ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
> [  862.202986][ T9683]  ? __pfx_nsim_ipsec_del_sa+0x10/0x10 [netdevsim]
> [  862.203251][ T9683]  ? bond_ipsec_del_sa+0x2c1/0x790
> [  862.203457][ T9683]  bond_ipsec_del_sa+0x2c1/0x790
> [  862.203648][ T9683]  ? __pfx_lock_acquire.part.0+0x10/0x10
> [  862.203845][ T9683]  ? __pfx_bond_ipsec_del_sa+0x10/0x10
> [  862.204034][ T9683]  ? do_raw_spin_lock+0x131/0x270
> [  862.204225][ T9683]  ? __pfx_do_raw_spin_lock+0x10/0x10
> [  862.204468][ T9683]  xfrm_dev_state_delete+0x97/0x170
> [  862.204665][ T9683]  __xfrm_state_delete+0x681/0x8e0
> [  862.204858][ T9683]  xfrm_state_flush+0x1bb/0x3a0
> [  862.205057][ T9683]  xfrm_flush_sa+0xf0/0x270 [xfrm_user]
> [  862.205290][ T9683]  ? __pfx_xfrm_flush_sa+0x10/0x10 [xfrm_user]
> [  862.205537][ T9683]  ? __nla_validate_parse+0x48/0x3d0
> [  862.205744][ T9683]  xfrm_user_rcv_msg+0x4f8/0x920 [xfrm_user]
> [  862.205985][ T9683]  ? __pfx___lock_release+0x10/0x10
> [  862.206174][ T9683]  ? __pfx_xfrm_user_rcv_msg+0x10/0x10 [xfrm_user]
> [  862.206412][ T9683]  ? __pfx_validate_chain+0x10/0x10
> [  862.206614][ T9683]  ? hlock_class+0x4e/0x130
> [  862.206807][ T9683]  ? mark_lock+0x38/0x3e0
> [  862.206986][ T9683]  ? __mutex_trylock_common+0xfa/0x260
> [  862.207181][ T9683]  ? __pfx___mutex_trylock_common+0x10/0x10
> [  862.207425][ T9683]  netlink_rcv_skb+0x130/0x360
