lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aRcnDwyMn11TfRUG@krikkit>
Date: Fri, 14 Nov 2025 13:56:47 +0100
From: Sabrina Dubroca <sd@...asysnail.net>
To: Cosmin Ratiu <cratiu@...dia.com>
Cc: netdev@...r.kernel.org, Jay Vosburgh <jv@...sburgh.net>,
	Andrew Lunn <andrew+netdev@...n.ch>,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	Simon Horman <horms@...nel.org>, Taehee Yoo <ap420073@...il.com>,
	Jianbo Liu <jianbol@...dia.com>,
	Steffen Klassert <steffen.klassert@...unet.com>,
	Herbert Xu <herbert@...dor.apana.org.au>,
	Leon Romanovsky <leonro@...dia.com>
Subject: Re: [PATCH ipsec v2 1/2] bond: Use xfrm_state_migrate to migrate SAs

2025-11-13, 12:43:09 +0200, Cosmin Ratiu wrote:
> The bonding driver manages offloaded SAs using the following strategy:
> 
> An xfrm_state offloaded on the bond device with bond_ipsec_add_sa() uses
> 'real_dev' on the xfrm_state xs to redirect the offload to the current
> active slave. The corresponding bond_ipsec_del_sa() (called with the xs
> spinlock held) redirects the unoffload call to real_dev.


> Finally,
> cleanup happens in bond_ipsec_free_sa(), which removes the offload from
> the device. Since the last call happens without the xs spinlock held,
> that is where the real work to unoffload actually happens.

Not on all devices (some don't even implement xdo_dev_state_free).


> 
> When the active slave changes to a new device a 3-step process is used
> to migrate all xfrm states to the new device:
>
> 1. bond_ipsec_del_sa_all() unoffloads all states in bond->ipsec_list
>    from the previously active device.
> 2. The active slave is flipped to the new device.
> 3. bond_ipsec_add_sa_all() offloads all states in bond->ipsec_list to
>    the new device.
> 
> This patch closes a race that could happen between xfrm_state migration
> and TX, which could result in unencrypted packets going out the wire:
> CPU1 (xfrm_output)                   CPU2 (bond_change_active_slave)
> bond_ipsec_offload_ok -> true
>                                      bond_ipsec_del_sa_all
> bond_xmit_activebackup
> bond_dev_queue_xmit
> dev_queue_xmit on old_dev
> 				     bond->curr_active_slave = new_dev
> 				     bond_ipsec_add_sa_all
> 
> So the packet makes it out to old_dev after the offloaded xfrm_state is
> deleted from it. The result: an unencrypted IPSec packet on the wire.
> 
> With the new approach, in-use states on old_dev will not be deleted
> until in-flight packets are transmitted.

How does this guarantee it? It would be good to describe how the new
approach closes the race with a bit more info than "use
xfrm_state_migrate".

And I don't think we currently guarantee that packets using offload
will be fully transmitted before xdo_dev_state_delete was called in
case of deletion. But ok, the bond case is worse due to the add/delete
dance when we change the active slave (and there's still the possible
issue Steffen mentioned a while ago, that this delete/add dance may
not be valid at all depending on how the HW behaves wrt IVs).


> It also makes for cleaner
> bonding code, which no longer needs to care about xfrm_state management
> so much.

But using the migrate code for that feels kind of hacky, and the 2nd
patch in this set also looks quite hacky.

And doing all that without protection against admin operations on the
xfrm state objects doesn't seem safe.


Thinking about the migrate behavior, if we fail to create/offload the
new state:
 - old state will be deleted
 - new state won't be created

So any packet we send afterwards that would need to use this SA will
just get dropped? (while the old behavior was "no more offload until
we change the active slave again"?)


> Fixes: ("ec13009472f4 bonding: implement xdo_dev_state_free and call it after deletion")

nit: wrong formatting of the Fixes tag


[just one comment on the diff, I'll look at it again if we decide to
proceed with this patch]

> @@ -533,36 +535,42 @@ static void bond_ipsec_add_sa_all(struct bonding *bond)
>  			slave_warn(bond_dev, real_dev,
>  				   "%s: no slave xdo_dev_state_add\n",
>  				   __func__);
> -		goto out;
> +		return;
>  	}
>  
> -	list_for_each_entry(ipsec, &bond->ipsec_list, list) {
> -		/* If new state is added before ipsec_lock acquired */
> -		if (ipsec->xs->xso.real_dev == real_dev)
> -			continue;
> +	/* Prepare the list of xfrm_states to be migrated. */
> +	mutex_lock(&bond->ipsec_lock);
> +	list_splice_init(&bond->ipsec_list, &ipsec_list);
> +	/* Add back states already offloaded on the new device before the
> +	 * lock was acquired and hold all remaining states to avoid them
> +	 * getting deleted during the migration.

Even with hold(), they could still be deleted (but not destroyed)?

> +	 */
> +	list_for_each_entry_safe(ipsec, tmp, &ipsec_list, list) {
> +		if (unlikely(ipsec->xs->xso.real_dev == real_dev))
> +			list_move_tail(&ipsec->list, &bond->ipsec_list);
> +		else
> +			xfrm_state_hold(ipsec->xs);
> +	}
> +	mutex_unlock(&bond->ipsec_lock);

-- 
Sabrina

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ