lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAN2vEgv2tKCZKJSjLNZHMGxDFycqMAkns=tjG6+Ps+N1PmZQ9g@mail.gmail.com>
Date: Wed, 9 Jul 2025 23:57:27 +0200
From: Erwan Dufour <mrarmonius@...il.com>
To: Hangbin Liu <liuhangbin@...il.com>
Cc: netdev@...r.kernel.org, steffen.klassert@...unet.com, 
	herbert@...dor.apana.org.au, davem@...emloft.net, jv@...sburgh.net, 
	saeedm@...dia.com, tariqt@...dia.com, erwan.dufour@...hings.com, 
	cratiu@...dia.com, leon@...nel.org
Subject: Re: [PATCH net-next v2] xfrm: bonding: Add XFRM packet-offload for active-backup

Hi Liu,

Thank you very much for your feedback,
Sorry for the duplicate — my first email contained HTML, which is
rejected by netdev@...r.kernel.org.

> I see you delete ipsec->xs here. Do you mean to prevent reuse of IV?

Yes—we do need to destroy the SA attached to the bond device.
Destruction is required because we can’t simply swap out the
Initialization Vector (IV) or Salt and re‑attach the SA to the device.
As Steffen noted, re‑using an IV+salt = nonce is a critical security
error, as spelled out in RFC 4106 for AES‑GCM. "For a given key, the
IV MUST NOT repeat" from RFC4106. For this cipher we carve out part of
the key (4 octets for salt) supplied when the SA is created. Later,
when we offload crypto or packet processing, the NIC keeps its own
independent counter on each physical interface, and that counter is
appended to the IV for every packet.

Re‑using the same SA therefore means re‑using the same key and
recreating the same IV while the counter resets to zero, which can
produce repeated IVs and thus a security vulnerability.


> But I can't find you add the xs back to new slave.
> Here you do xdo_dev_policy_add(). What about the xdo_dev_state_add()?

As described in RFC 4106, it is recommended to use Internet Key
Exchange (IKE), as specified in RFC 2409, to ensure the uniqueness of
the key/nonce combination.
In our case, we do not want to re-use an SA whose nonce (salt + IV)
would be repeated for all packets sent over the primary link before
the fallback. To prevent this, our solution is to expire the SA and
let IKE generate a new one.

There are two types of SA expiration in IKE: soft and hard. A soft
expiration signals that the SA should be replaced, but it can still be
used for a short time until it is replaced by IKE or removed by the
kernel. In my case, I chose hard expiration by explicitly deleting the
SA to ensure it is never used on the new link.
Therefore, when an SA is expired, it is not necessarily deleted. The
expiration function simply broadcasts a notification to all processes
listening to XFRM, indicating that the SA needs to be renewed. IKE
will then handle the destruction and replacement of the SA.
Since we expire the SA only after ensuring that the new primary slave
has been selected, we can be confident that when IKE attempts to add a
new SA, it will find a valid real_dev — and the correct one

I tested the new code with IKE charond_systemd which is often used
with strongswan_swanctl. And of course, it's working !



> Here the xdo_dev_state_delete() is called when km.state == XFRM_STATE_DEAD.
> Why we remove this?

This piece of code was used to remove the SA we had added to the
device, in case the device was in the DEAD state. The device could be
in that state if it had been deleted in parallel with the change of
the primary slave. The destruction function on the device would have
failed because real_dev was null at that point.

But as you've seen, in the new code we no longer add the SA to the
device in any case, so there's no need to remove it from the device
since it was never added in the first place.

That’s why I decided to remove this part of the code — it’s no longer
needed and could potentially trigger an error in the
xdo_dev_state_delete function.



I hope I’ve answered your questions and that my responses are clear.



@Steffen Klassert, may I take advantage of your kindness and ask if
you know the reasons why IKE was implemented in userland rather than
in the kernel? Since it's a standardized protocol, I thought it could
have been part of the kernel(RFC 2409).

Thanks,

Best regards,

Le mer. 9 juil. 2025 à 04:01, Hangbin Liu <liuhangbin@...il.com> a écrit :
>
> Hi Erwan,
>
> On Sun, Jul 06, 2025 at 04:58:04PM +0200, Erwan Dufour wrote:
> > From: Erwan Dufour <erwan.dufour@...hings.com>
> >
> > New features added:
> > - Use of packet offload added for XFRM in active-backup
> > - Behaviour modification when changing primary slave to prevent reuse of IV.
>
> ...
>
> >
> > -static void bond_ipsec_add_sa_all(struct bonding *bond)
> > +static void bond_ipsec_add_sa_sp_all(struct bonding *bond)
> >  {
> >       struct net_device *bond_dev = bond->dev;
> >       struct net_device *real_dev;
> >       struct bond_ipsec *ipsec;
> >       struct slave *slave;
> > +     int err;
> >
> >       slave = rtnl_dereference(bond->curr_active_slave);
> >       real_dev = slave ? slave->dev : NULL;
> >       if (!real_dev)
> >               return;
> >
> > -     mutex_lock(&bond->ipsec_lock);
> > +     mutex_lock(&bond->ipsec_lock_sa);
> >       if (!real_dev->xfrmdev_ops ||
> >           !real_dev->xfrmdev_ops->xdo_dev_state_add ||
> >           netif_is_bond_master(real_dev)) {
> > -             if (!list_empty(&bond->ipsec_list))
> > +             if (!list_empty(&bond->ipsec_list_sa))
> >                       slave_warn(bond_dev, real_dev,
> >                                  "%s: no slave xdo_dev_state_add\n",
> >                                  __func__);
> > -             goto out;
> > +             goto out_sa;
> >       }
> >
> > -     list_for_each_entry(ipsec, &bond->ipsec_list, list) {
> > -             /* If new state is added before ipsec_lock acquired */
> > +     list_for_each_entry(ipsec, &bond->ipsec_list_sa, list) {
> > +             /* If new state is added before ipsec_lock_sa acquired */
> >               if (ipsec->xs->xso.real_dev == real_dev)
> >                       continue;
> >
> > -             if (real_dev->xfrmdev_ops->xdo_dev_state_add(real_dev,
> > -                                                          ipsec->xs, NULL)) {
> > -                     slave_warn(bond_dev, real_dev, "%s: failed to add SA\n", __func__);
> > +             err = __xfrm_state_delete(ipsec->xs);
> > +             if (!err)
> > +                     km_state_expired(ipsec->xs, 1, 0);
> > +
> > +             xfrm_audit_state_delete(ipsec->xs, err ? 0 : 1, true);
>
> I see you delete ipsec->xs here. Do you mean to prevent reuse of IV?
> But I can't find you add the xs back to new slave.
>
> > +     }
> > +out_sa:
> > +     mutex_unlock(&bond->ipsec_lock_sa);
> > +
> > +     mutex_lock(&bond->ipsec_lock_sp);
> > +     if (!real_dev->xfrmdev_ops ||
> > +         !real_dev->xfrmdev_ops->xdo_dev_policy_add ||
> > +         netif_is_bond_master(real_dev)) {
> > +             if (!list_empty(&bond->ipsec_list_sp))
> > +                     slave_warn(bond_dev, real_dev,
> > +                                "%s: no slave xdo_dev_policy_add\n",
> > +                                __func__);
> > +             goto out_sp;
> > +     }
> > +     list_for_each_entry(ipsec, &bond->ipsec_list_sp, list) {
> > +             if (ipsec->xp->xdo.real_dev == real_dev)
> > +                     continue;
> > +
> > +             if (real_dev->xfrmdev_ops->xdo_dev_policy_add(real_dev,
> > +                                                           ipsec->xp,
> > +                                                           NULL)) {
> > +                     slave_warn(bond_dev, real_dev,
> > +                                "%s: failed to add SP\n", __func__);
> >                       continue;
>
> Here you do xdo_dev_policy_add(). What about the xdo_dev_state_add()?
>
> >               }
> >
> > -             spin_lock_bh(&ipsec->xs->lock);
> > -             /* xs might have been killed by the user during the migration
> > -              * to the new dev, but bond_ipsec_del_sa() should have done
> > -              * nothing, as xso.real_dev is NULL.
> > -              * Delete it from the device we just added it to. The pending
> > -              * bond_ipsec_free_sa() call will do the rest of the cleanup.
> > -              */
> > -             if (ipsec->xs->km.state == XFRM_STATE_DEAD &&
> > -                 real_dev->xfrmdev_ops->xdo_dev_state_delete)
> > -                     real_dev->xfrmdev_ops->xdo_dev_state_delete(real_dev,
> > -                                                                 ipsec->xs);
>
> Here the xdo_dev_state_delete() is called when km.state == XFRM_STATE_DEAD.
> Why we remove this?
>
> > -             ipsec->xs->xso.real_dev = real_dev;
> > -             spin_unlock_bh(&ipsec->xs->lock);
> > +             write_lock_bh(&ipsec->xp->lock);
> > +             ipsec->xp->xdo.real_dev = real_dev;
> > +             write_unlock_bh(&ipsec->xp->lock);
> >       }
> > -out:
> > -     mutex_unlock(&bond->ipsec_lock);
> > +
> > +out_sp:
> > +     mutex_unlock(&bond->ipsec_lock_sp);
> >  }
>
> Thanks
> Hangbin

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ