Message-Id: <1E373854-9996-49E6-8609-194CEAFA29ED@bamaicloud.com>
Date: Tue, 14 Oct 2025 20:52:19 +0800
From: Tonghao Zhang <tonghao@...aicloud.com>
To: Jiri Slaby <jirislaby@...nel.org>
Cc: netdev@...r.kernel.org,
 Jay Vosburgh <jv@...sburgh.net>,
 "David S. Miller" <davem@...emloft.net>,
 Eric Dumazet <edumazet@...gle.com>,
 Jakub Kicinski <kuba@...nel.org>,
 Paolo Abeni <pabeni@...hat.com>,
 Simon Horman <horms@...nel.org>,
 Jonathan Corbet <corbet@....net>,
 Andrew Lunn <andrew+netdev@...n.ch>,
 Steven Rostedt <rostedt@...dmis.org>,
 Masami Hiramatsu <mhiramat@...nel.org>,
 Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
 Nikolay Aleksandrov <razor@...ckwall.org>,
 Zengbing Tu <tuzengbing@...iglobal.com>
Subject: Re: [net-next v8 1/3] net: bonding: add broadcast_neighbor option for
 802.3ad



> On Oct 14, 2025, at 17:12, Jiri Slaby <jirislaby@...nel.org> wrote:
> 
> On 27. 06. 25, 15:49, Tonghao Zhang wrote:
>> Stacking technology is a type of technology used to expand ports on
>> Ethernet switches. It is widely used as a common access method in
>> large-scale Internet data center architectures. Years of practice
>> have proved that stacking technology has advantages and disadvantages
>> in high-reliability network architecture scenarios. For instance,
>> in stacking networking arch, conventional switch system upgrades
>> require multiple stacked devices to restart at the same time.
>> Therefore, it is inevitable that the business will be interrupted
>> for a while. It is for this reason that "no-stacking" in data centers
>> has become a trend. Additionally, when the stacking link connecting
>> the switches fails or is abnormal, the stack will split. Although it is
>> not common, it still happens in actual operation. The problem is that
>> after the split, it is equivalent to two switches with the same
>> configuration appearing in the network, causing network configuration
>> conflicts and ultimately interrupting the services carried by the
>> stacking system.
>> To improve network stability, "non-stacking" solutions have been
>> increasingly adopted, particularly by public cloud providers and
>> tech companies like Alibaba, Tencent, and Didi. "non-stacking" is
>> a method of mimicking switch stacking that convinces a LACP peer,
>> bonding in this case, connected to a set of "non-stacked" switches
>> that all of its ports are connected to a single switch
>> (i.e., LACP aggregator), as if those switches were stacked. This
>> enables the LACP peer's ports to aggregate together, and requires
>> (a) special switch configuration, described in the linked article,
>> and (b) modifications to the bonding 802.3ad (LACP) mode to send
>> all ARP/ND packets across all ports of the active aggregator.
>> Note that, with multiple aggregators, the current broadcast mode
>> logic will send only packets to the selected aggregator(s).
>>  +-----------+   +-----------+
>>  |  switch1  |   |  switch2  |
>>  +-----------+   +-----------+
>>          ^           ^
>>          |           |
>>       +-----------------+
>>       |   bond4 lacp    |
>>       +-----------------+
>>          |           |
>>          | NIC1      | NIC2
>>       +-----------------+
>>       |     server      |
>>       +-----------------+
> 
> Hi,
> 
> this breaks broadcast bonding in 6.17. Reverting these three (the two depend on this one) makes 6.17 work again:
> 2f9afffc399d net: bonding: send peer notify when failure recovery
> 3d98ee52659c net: bonding: add broadcast_neighbor netlink option
> ce7a381697cb net: bonding: add broadcast_neighbor option for 802.3ad
> 
> This was reported downstream as an error in our openQA:
> https://bugzilla.suse.com/show_bug.cgi?id=1250894
> 
> I bisected using this in qemu:
> systemctl stop network
> ip link del bond0 || true
> ip link set dev eth0 down
> ip addr flush eth0
> ip link add bond0 type bond mode broadcast
> ip link set dev eth0 master bond0
> ip addr add 10.0.2.15/24 dev bond0
> ip link set bond0 up
> sleep 1
> exec nmap -sS 10.0.2.2/32
> 
> Any ideas?
> 
>> - https://www.ruijie.com/fr-fr/support/tech-gallery/de-stack-data-center-network-architecture/
>> Cc: Jay Vosburgh <jv@...sburgh.net>
>> Cc: "David S. Miller" <davem@...emloft.net>
>> Cc: Eric Dumazet <edumazet@...gle.com>
>> Cc: Jakub Kicinski <kuba@...nel.org>
>> Cc: Paolo Abeni <pabeni@...hat.com>
>> Cc: Simon Horman <horms@...nel.org>
>> Cc: Jonathan Corbet <corbet@....net>
>> Cc: Andrew Lunn <andrew+netdev@...n.ch>
>> Cc: Steven Rostedt <rostedt@...dmis.org>
>> Cc: Masami Hiramatsu <mhiramat@...nel.org>
>> Cc: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
>> Cc: Nikolay Aleksandrov <razor@...ckwall.org>
>> Signed-off-by: Tonghao Zhang <tonghao@...aicloud.com>
>> Signed-off-by: Zengbing Tu <tuzengbing@...iglobal.com>
>> ---
>> v8: add comments info in bond_option_mode_set, explain why we only
>> clear broadcast_neighbor to 0.
>> Note that the selftest will be posted after I post the iproute2 patch about
>> this option.
>> ---
>>  Documentation/networking/bonding.rst |  6 +++
>>  drivers/net/bonding/bond_main.c      | 66 +++++++++++++++++++++++++---
>>  drivers/net/bonding/bond_options.c   | 42 ++++++++++++++++++
>>  include/net/bond_options.h           |  1 +
>>  include/net/bonding.h                |  3 ++
>>  5 files changed, 112 insertions(+), 6 deletions(-)
> ...
>> --- a/drivers/net/bonding/bond_main.c
>> +++ b/drivers/net/bonding/bond_main.c
> ...
>> @@ -5329,17 +5369,27 @@ static netdev_tx_t bond_3ad_xor_xmit(struct sk_buff *skb,
>>   return bond_tx_drop(dev, skb);
>>  }
>>  -/* in broadcast mode, we send everything to all usable interfaces. */
>> +/* in broadcast mode, we send everything to all or usable slave interfaces.
>> + * under rcu_read_lock when this function is called.
>> + */
>>  static netdev_tx_t bond_xmit_broadcast(struct sk_buff *skb,
>> -       struct net_device *bond_dev)
>> +       struct net_device *bond_dev,
>> +       bool all_slaves)
>>  {
>>   struct bonding *bond = netdev_priv(bond_dev);
>> - struct slave *slave = NULL;
>> - struct list_head *iter;
>> + struct bond_up_slave *slaves;
>>   bool xmit_suc = false;
>>   bool skb_used = false;
>> + int slaves_count, i;
>>  - bond_for_each_slave_rcu(bond, slave, iter) {
>> + if (all_slaves)
>> + slaves = rcu_dereference(bond->all_slaves);
>> + else
>> + slaves = rcu_dereference(bond->usable_slaves);
>> +
>> + slaves_count = slaves ? READ_ONCE(slaves->count) : 0;
> 
> OK, slaves_count is now 0 (slaves and bond->all_slaves are NULL), but bond_for_each_slave_rcu() used to yield 1 iface.
> 
> Well, bond_update_slave_arr() is not called for broadcast AFAICS.
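Thank you for pointing out this issue. We don't need to revert the patches. Could you test whether the following patch fixes the problem for you? It calls bond_update_slave_arr() in broadcast mode as well, both when a slave is enslaved and when it is released, so that bond->all_slaves is populated before bond_xmit_broadcast() reads it. I will add test cases later.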

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index a8034a561011..c950e1e7f284 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2384,7 +2384,8 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
                unblock_netpoll_tx();
        }

-       if (bond_mode_can_use_xmit_hash(bond))
+       if (bond_mode_can_use_xmit_hash(bond) ||
+           BOND_MODE(bond) == BOND_MODE_BROADCAST)
                bond_update_slave_arr(bond, NULL);

        if (!slave_dev->netdev_ops->ndo_bpf ||
@@ -2560,7 +2561,8 @@ static int __bond_release_one(struct net_device *bond_dev,

        bond_upper_dev_unlink(bond, slave);

-       if (bond_mode_can_use_xmit_hash(bond))
+       if (bond_mode_can_use_xmit_hash(bond) ||
+           BOND_MODE(bond) == BOND_MODE_BROADCAST)
                bond_update_slave_arr(bond, slave);

        slave_info(bond_dev, slave_dev, "Releasing %s interface\n",
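
To make the failure mode concrete, here is a small user-space model in C. It is only a sketch, not kernel code: every model_* name is hypothetical and invented for this illustration, the RCU handling is left out, and the per-slave link-state check is ignored. It assumes only what the thread states: the transmit path reads a counted slave array, that array stays NULL until an update function runs, and before the patch above the update ran only for modes that can use the xmit hash.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_SLAVES 8

/* Hypothetical stand-in for the counted slave array read on transmit. */
struct model_slave_arr {
	size_t count;
	int arr[MODEL_MAX_SLAVES];
};

/* Hypothetical, heavily simplified stand-in for the bond state. */
struct model_bond {
	bool broadcast_mode;      /* models BOND_MODE_BROADCAST */
	bool can_use_xmit_hash;   /* models bond_mode_can_use_xmit_hash() */
	size_t n_enslaved;
	int enslaved[MODEL_MAX_SLAVES];
	struct model_slave_arr storage;
	const struct model_slave_arr *all_slaves; /* NULL until rebuilt */
};

/* Rebuild the array from the list of enslaved devices. */
static void model_update_slave_arr(struct model_bond *b)
{
	size_t i;

	assert(b->n_enslaved <= MODEL_MAX_SLAVES);
	b->storage.count = b->n_enslaved;
	for (i = 0; i < b->n_enslaved; i++)
		b->storage.arr[i] = b->enslaved[i];
	b->all_slaves = &b->storage;
}

/*
 * Enslave one device. with_fix selects the condition from the patch
 * above; without it the array is rebuilt only for xmit-hash modes.
 */
static bool model_enslave(struct model_bond *b, int id, bool with_fix)
{
	if (b->n_enslaved >= MODEL_MAX_SLAVES)
		return false;
	b->enslaved[b->n_enslaved++] = id;

	if (b->can_use_xmit_hash || (with_fix && b->broadcast_mode))
		model_update_slave_arr(b);
	return true;
}

/* Return how many slaves a broadcast frame would be handed to. */
static size_t model_xmit_broadcast(const struct model_bond *b)
{
	const struct model_slave_arr *slaves = b->all_slaves;
	size_t count = slaves ? slaves->count : 0;
	size_t sent = 0;
	size_t i;

	for (i = 0; i < count; i++)
		sent++; /* the real code clones and transmits the skb here */
	return sent;
}
```

In this model, a broadcast bond with one device enslaved using the old condition leaves all_slaves NULL, so model_xmit_broadcast() hands the frame to no slave. With the new condition the array is rebuilt on enslave and the same call reaches the one slave.
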
> 
>> + for (i = 0; i < slaves_count; i++) {
>> + struct slave *slave = slaves->arr[i];
>>   struct sk_buff *skb2;
>>     if (!(bond_slave_is_up(slave) && slave->link == BOND_LINK_UP))
>> @@ -5577,10 +5627,13 @@ static netdev_tx_t __bond_start_xmit(struct sk_buff *skb, struct net_device *dev
>>   case BOND_MODE_ACTIVEBACKUP:
>>   return bond_xmit_activebackup(skb, dev);
>>   case BOND_MODE_8023AD:
>> + if (bond_should_broadcast_neighbor(skb, dev))
>> + return bond_xmit_broadcast(skb, dev, false);
>> + fallthrough;
>>   case BOND_MODE_XOR:
>>   return bond_3ad_xor_xmit(skb, dev);
>>   case BOND_MODE_BROADCAST:
>> - return bond_xmit_broadcast(skb, dev);
>> + return bond_xmit_broadcast(skb, dev, true);
>>   case BOND_MODE_ALB:
>>   return bond_alb_xmit(skb, dev);
>>   case BOND_MODE_TLB:
>> @@ -6456,6 +6509,7 @@ static int __init bond_check_params(struct bond_params *params)
>>   eth_zero_addr(params->ad_actor_system);
>>   params->ad_user_port_key = ad_user_port_key;
>>   params->coupled_control = 1;
>> + params->broadcast_neighbor = 0;
>>   if (packets_per_slave > 0) {
>>   params->reciprocal_packets_per_slave =
>>   reciprocal_value(packets_per_slave);
> 
> -- 
> js
> suse labs
> 
> 

