lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aTJWI3aybYO-NHg5@fedora>
Date: Fri, 5 Dec 2025 03:48:51 +0000
From: Hangbin Liu <liuhangbin@...il.com>
To: Cosmin Ratiu <cratiu@...dia.com>
Cc: netdev@...r.kernel.org, Jay Vosburgh <jv@...sburgh.net>,
	Andrew Lunn <andrew+netdev@...n.ch>,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	Simon Horman <horms@...nel.org>, Taehee Yoo <ap420073@...il.com>,
	Jianbo Liu <jianbol@...dia.com>,
	Sabrina Dubroca <sd@...asysnail.net>,
	Steffen Klassert <steffen.klassert@...unet.com>,
	Herbert Xu <herbert@...dor.apana.org.au>,
	Leon Romanovsky <leonro@...dia.com>
Subject: Re: [RFC PATCH ipsec 2/2] bonding: Maintain offloaded xfrm on all
 devices

On Fri, Nov 21, 2025 at 05:16:44PM +0200, Cosmin Ratiu wrote:
> The bonding driver manages offloaded SAs using the following strategy:
> 
> An xfrm_state offloaded on the bond device with bond_ipsec_add_sa() uses
> 'real_dev' on the xfrm_state xs to redirect the offload to the current
> active slave. The corresponding bond_ipsec_del_sa() (called with the xs
> spinlock held) redirects the unoffload call to real_dev. Finally,
> cleanup happens in bond_ipsec_free_sa(), which removes the offload from
> the device. Since the last call happens without the xs spinlock held,
> that is where the real work to unoffload actually happens.
> 
> When the active slave changes to a new device a 3-step process is used
> to migrate all xfrm states to the new device:
> 1. bond_ipsec_del_sa_all() unoffloads all states in bond->ipsec_list
>    from the previously active device.
> 2. The active slave is flipped to the new device.
> 3. bond_ipsec_add_sa_all() offloads all states in bond->ipsec_list to
>    the new device.
> 
> There can be two races which result in unencrypted IPSec packets being
> transmitted on the wire:
> 
> 1. Unencrypted IPSec packet on old_dev:
> CPU1 (xfrm_output)                   CPU2 (bond_change_active_slave)
> bond_ipsec_offload_ok -> true
>                                      bond_ipsec_del_sa_all
> bond_xmit_activebackup
> bond_dev_queue_xmit
> dev_queue_xmit on old_dev
> 				     bond->curr_active_slave = new_dev
> 				     bond_ipsec_add_sa_all
> 
> 2. Unencrypted IPSec packet on new_dev:
> CPU1 (xfrm_output)                   CPU2 (bond_change_active_slave)
> bond_ipsec_offload_ok -> true
>                                      bond->curr_active_slave = new_dev
>                                      bond_ipsec_migrate_sa_all
> bond_xmit_activebackup
> bond_dev_queue_xmit
> dev_queue_xmit on new_dev
> 				     bond_ipsec_migrate_sa_all finishes
> 
> This patch fixes both these issues. Bonding now maintain SAs on all
> devices by making use of the previous patch that allows the same xfrm
> state to be offloaded on multiple devices. This consists of:
> 
> 1. Maintaining two linked lists:
> - bond->ipsec_list is the list of xfrm states offloaded to the bonding
>   device.
> - Each slave has its own bond->ipsec_offloads list holding offloads of
>   bond->ipsec_list on that slave.
> These lists are protected by the existing bond->ipsec_lock mutex.
> 
> 2. When a slave is added (bond_enslave), bond_ipsec_add_sa_all now
>    offloads all xfrm states to the new device.
> 
> 3. When a slave is removed (__bond_release_one), bond_ipsec_del_sa_all
>    now removes all xfrm state offloads from that device.
> 
> 4. When the active slave is changed (bond_change_active_slave), a new
>    bond_ipsec_migrate_sa_all function switches xs->xso.real_dev and
>    xs->xso.offload handle for all offloaded xfrm states.
>    xdo_dev_state_advance_esn is also called on the new device to update
>    the esn state.
> 
> 5. Adding an offloaded xfrm state to the bond device must now iterate
>    through active slaves. To make that nice, RTNL is grabbed there. The
>    alternative is repeatedly grabbing each slave under the RCU lock,
>    holding it, releasing the lock to be able to offload a state, then
>    re-grabbing the RCU lock and releasing the slave. RTNL seems cleaner.
> 
> 6. bond_ipsec_del_sa (.xdo_dev_state_delete for bond) is unchanged, it
>    now only deletes the state from the active device and leaves the rest
>    for the xdo_dev_state_free callback, which can grab the required
>    locks.
> 
> Fixes: 9a5605505d9c ("bonding: Add struct bond_ipesc to manage SA")
> Signed-off-by: Cosmin Ratiu <cratiu@...dia.com>
> ---
>  drivers/net/bonding/bond_main.c | 283 +++++++++++++++++---------------
>  include/net/bonding.h           |  22 ++-
>  2 files changed, 164 insertions(+), 141 deletions(-)
> 
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 4c5b73786877..979e5aabf8d2 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -452,6 +452,61 @@ static struct net_device *bond_ipsec_dev(struct xfrm_state *xs)
>  	return slave->dev;
>  }
>  
> +static struct bond_ipsec_offload*
> +bond_ipsec_dev_add_sa(struct net_device *dev, struct bond_ipsec *ipsec,
> +		      struct netlink_ext_ack *extack)
> +{
> +	struct bond_ipsec_offload *offload;
> +	int err;
> +
> +	if (!dev->xfrmdev_ops ||
> +	    !dev->xfrmdev_ops->xdo_dev_state_add ||
> +	    netif_is_bond_master(dev)) {
> +		NL_SET_ERR_MSG_MOD(extack,
> +				   "Slave does not support ipsec offload");
> +		return ERR_PTR(-EINVAL);
> +	}
> +
> +	offload = kzalloc(sizeof(*offload), GFP_KERNEL);
> +	if (!offload)
> +		return ERR_PTR(-ENOMEM);
> +
> +	offload->ipsec = ipsec;
> +	offload->dev = dev;
> +	err = dev->xfrmdev_ops->xdo_dev_state_add(dev, ipsec->xs,
> +						   &offload->handle, extack);
> +	if (err)

Here we need to free the offload.

> +		return ERR_PTR(err);
> +	return offload;
> +}
> +
> +static void bond_ipsec_dev_del_sa(struct bond_ipsec_offload *offload)
> +{
> +	struct xfrm_state *xs = offload->ipsec->xs;
> +	struct net_device *dev = offload->dev;
> +
> +	if (dev->xfrmdev_ops->xdo_dev_state_delete) {
> +		spin_lock_bh(&xs->lock);
> +		/* Don't double delete states killed by the user
> +		 * from xs->xso.real_dev.
> +		 */
> +		if (dev != xs->xso.real_dev ||
> +		    xs->km.state != XFRM_STATE_DEAD)
> +			dev->xfrmdev_ops->xdo_dev_state_delete(dev, xs,
> +							       offload->handle);
> +		if (xs->xso.real_dev == dev)
> +			xs->xso.real_dev = NULL;
> +		spin_unlock_bh(&xs->lock);
> +	}
> +
> +	if (dev->xfrmdev_ops->xdo_dev_state_free)
> +		dev->xfrmdev_ops->xdo_dev_state_free(dev, xs, offload->handle);
> +
> +	list_del(&offload->list);
> +	list_del(&offload->ipsec_list);
> +	kfree(offload);
> +}
> +
[...]

> -static void bond_ipsec_add_sa_all(struct bonding *bond)
> +static void bond_ipsec_add_sa_all(struct bonding *bond, struct slave *new_slave,
> +				  struct netlink_ext_ack *extack)
>  {
> +	struct net_device *real_dev = new_slave->dev;
>  	struct net_device *bond_dev = bond->dev;
> -	struct net_device *real_dev;
>  	struct bond_ipsec *ipsec;
>  	struct slave *slave;
>  
> -	slave = rtnl_dereference(bond->curr_active_slave);
> -	real_dev = slave ? slave->dev : NULL;
> -	if (!real_dev)
> -		return;
> +	INIT_LIST_HEAD(&new_slave->ipsec_offloads);
>  
>  	mutex_lock(&bond->ipsec_lock);
> -	if (!real_dev->xfrmdev_ops ||
> -	    !real_dev->xfrmdev_ops->xdo_dev_state_add ||
> -	    netif_is_bond_master(real_dev)) {
> -		if (!list_empty(&bond->ipsec_list))
> -			slave_warn(bond_dev, real_dev,
> -				   "%s: no slave xdo_dev_state_add\n",
> -				   __func__);
> -		goto out;
> -	}
> -
>  	list_for_each_entry(ipsec, &bond->ipsec_list, list) {
> +		struct bond_ipsec_offload *offload;
>  		struct xfrm_state *xs = ipsec->xs;
>  
> -		/* If new state is added before ipsec_lock acquired */
> -		if (xs->xso.real_dev == real_dev)
> -			continue;
> -
> -		if (real_dev->xfrmdev_ops->xdo_dev_state_add(real_dev, xs,
> -							     &xs->xso.offload_handle,
> -							     NULL)) {
> +		offload = bond_ipsec_dev_add_sa(slave->dev, ipsec, extack);

Here should be real_dev.

> +		if (IS_ERR(offload)) {
>  			slave_warn(bond_dev, real_dev, "%s: failed to add SA\n", __func__);
>  			continue;
>  		}

If we add offload failed on the slave, what would happen during migrate?

Thanks
Hangbin

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ