netdev - Re: [PATCHv2 net] bonding: fix multicast MAC address synchronization

Message-ID: <aJwO3vcLipougMid@fedora>
Date: Wed, 13 Aug 2025 04:04:46 +0000
From: Hangbin Liu <liuhangbin@...il.com>
To: Paolo Abeni <pabeni@...hat.com>
Cc: netdev@...r.kernel.org, Jay Vosburgh <jv@...sburgh.net>,
	Andrew Lunn <andrew+netdev@...n.ch>,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>,
	Nikolay Aleksandrov <razor@...ckwall.org>,
	Simon Horman <horms@...nel.org>, linux-kernel@...r.kernel.org,
	Liang Li <liali@...hat.com>
Subject: Re: [PATCHv2 net] bonding: fix multicast MAC address synchronization

On Tue, Aug 12, 2025 at 10:42:22AM +0200, Paolo Abeni wrote:
> On 8/5/25 10:09 AM, Hangbin Liu wrote:
> > There is a corner case where the NS (Neighbor Solicitation) target is set to
> > an invalid or unreachable address. In such cases, all the slave links are
> > marked as down and set to *backup*. This causes the bond to add multicast MAC
> > addresses to all slaves. The ARP monitor then cycles through each slave to
> > probe them, temporarily marking as *active*.
> > 
> > Later, if the NS target is changed or cleared during this probe cycle, the
> > *active* slave will fail to remove its NS multicast address because
> > bond_slave_ns_maddrs_del() only removes addresses from backup slaves.
> > This leaves stale multicast MACs on the interface.
> > 
> > To fix this, we move the NS multicast MAC address handling into
> > bond_set_slave_state(), so every slave state transition consistently
> > adds/removes NS multicast addresses as needed.
> > 
> > We also ensure this logic is only active when arp_interval is configured,
> > to prevent misconfiguration or accidental behavior in unsupported modes.
> 
> As noted by Jay in the previous revision, moving the handling into
> bond_set_slave_state() could possibly impact a lot of scenarios, and
> it's not obvious to me that restricting to arp_interval != 0 would be
> sufficient.

I understand your concern. The bond_set_slave_state() function is called by:
  - bond_set_slave_inactive_flags
  - bond_set_slave_tx_disabled_flags
  - bond_set_slave_active_flags

These functions are mainly invoked via bond_change_active_slave, bond_enslave,
bond_ab_arp_commit, and bond_miimon_commit.

To avoid misconfiguration, in slave_can_set_ns_maddr() I tried to limit
changes to the backup slave when operating in active-backup mode with
arp_interval enabled. I also ensured that the multicast address is only
modified when the NS target is set.
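
As a rough illustration of those conditions (a standalone sketch, not the
code from the patch; every type and name below is hypothetical and only
mirrors the checks described above):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the bonding configuration. */
enum sketch_bond_mode { SKETCH_MODE_ACTIVE_BACKUP, SKETCH_MODE_OTHER };

struct sketch_bond_cfg {
	enum sketch_bond_mode mode;
	unsigned int arp_interval;	/* 0 means the ARP monitor is disabled */
	bool ns_target_set;		/* at least one NS target configured */
};

/*
 * Returns true only when all conditions described above hold:
 * active-backup mode, arp_interval enabled, an NS target set,
 * and the slave in question is a backup slave.
 */
static bool sketch_can_set_ns_maddr(const struct sketch_bond_cfg *cfg,
				    bool slave_is_backup)
{
	return cfg->mode == SKETCH_MODE_ACTIVE_BACKUP &&
	       cfg->arp_interval != 0 &&
	       cfg->ns_target_set &&
	       slave_is_backup;
}

/* Small self-check showing which combinations pass the guard. */
static void sketch_self_check(void)
{
	struct sketch_bond_cfg cfg = { SKETCH_MODE_ACTIVE_BACKUP, 100, true };

	assert(sketch_can_set_ns_maddr(&cfg, true));
	assert(!sketch_can_set_ns_maddr(&cfg, false));
}
```

With a guard of this shape, any caller outside active-backup mode, or with
arp_interval set to 0, would leave the multicast list untouched.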

> 
> I'm wondering if the issue could/should instead addressed explicitly
> handling the mac swap for the active slave at NS target change time. WDYT?

The problem is that bond_hw_addr_swap() is only reached via
bond_ab_arp_commit() during ARP monitoring, while the bond sets the
active/inactive flags in bond_ab_arp_probe(). These two steps run separately:

bond_activebackup_arp_mon
 - bond_ab_arp_commit
   - bond_select_active_slave
     - bond_change_active_slave
       - bond_hw_addr_swap
 - bond_ab_arp_probe
   - bond_set_slave_{active/inactive}_flags

On the other hand, we need to set the multicast address on the *temporary*
active interface to ensure we can receive the NA reply. The MAC swap only
happens when the *actual* active interface is chosen.

This is why I chose to place the multicast address configuration in
bond_set_slave_state().
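
To make the stale-address scenario from the commit message concrete, here is
a toy model (hypothetical types and helpers, not kernel code). It assumes
three slaves that all hold the NS multicast address, one of which is
temporarily marked active by the probe cycle when the NS target is cleared:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these names are hypothetical, not the bonding driver's. */
enum toy_state { TOY_ACTIVE, TOY_BACKUP };

struct toy_slave {
	enum toy_state state;
	bool has_ns_maddr;	/* NS multicast MAC currently programmed? */
};

/* Cleanup that only touches slaves currently in the backup state. */
static void toy_del_maddrs_backup_only(struct toy_slave *s, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (s[i].state == TOY_BACKUP)
			s[i].has_ns_maddr = false;
}

/* Replays the scenario and returns how many slaves keep a stale address. */
static size_t toy_stale_after_target_clear(void)
{
	struct toy_slave s[3];
	size_t stale = 0;

	/* Unreachable NS target: every slave is backup and has the address. */
	for (size_t i = 0; i < 3; i++) {
		s[i].state = TOY_BACKUP;
		s[i].has_ns_maddr = true;
	}

	/* The probe cycle temporarily marks one slave as active. */
	s[1].state = TOY_ACTIVE;

	/* The NS target is cleared while that slave is still marked active. */
	toy_del_maddrs_backup_only(s, 3);

	for (size_t i = 0; i < 3; i++)
		if (s[i].has_ns_maddr)
			stale++;
	return stale;
}
```

In this model the temporarily active slave is skipped by the backup-only
cleanup, so one address is left behind. Tying the add/remove to the state
transition itself avoids depending on which state a slave happens to be in
when the target changes.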

Thanks
Hangbin