netdev - Re: [PATCH net] bonding: 802.3ad: Avoid packet loss when switching aggregator


Message-ID: <20240415185720.399e054f@samweis>
Date: Mon, 15 Apr 2024 18:57:20 +0200
From: Thomas Bogendoerfer <tbogendoerfer@...e.de>
To: Jay Vosburgh <jay.vosburgh@...onical.com>
Cc: Andy Gospodarek <andy@...yhouse.net>, "David S. Miller"
 <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski
 <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, netdev@...r.kernel.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH net] bonding: 802.3ad: Avoid packet loss when switching
 aggregator

On Wed, 10 Apr 2024 17:28:29 -0700
Jay Vosburgh <jay.vosburgh@...onical.com> wrote:

> 	First, I'm not sure why your port is in WAITING state, unless
> it's simply that your test is happening very quickly after the port is
> added to the bond.  The standard (IEEE 802.1AX-2014 6.4.15) requires
> ports to remain in WAITING state for 2 seconds when transitioning from
> DETACHED to ATTACHED state (to limit thrashing when multiple ports are
> added in a short span of time).
> 
> 	You mention the issue happens when the aggregator changes; do
> you have a detailed sequence of events that describe how the issue is
> induced?

The setup is one Linux server with two dual-port Ethernet cards connected
to an HP 5710 FlexFabric switch with two modules. Using MC-LAG is probably
the key to triggering the issue; at least I couldn't reproduce it without
MC-LAG.

1. create bond0
2. enslave 4 ports to it
3. wait for link up
4. do duplicate address detection (DAD)
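
For anyone who wants to script the steps above, here is a rough sketch
that only builds the command lines and prints them; it does not run
anything. The mail does not say which tools were used, so the use of
iproute2 and "arping -D", the helper names, and the address 192.0.2.10
are all assumptions. The slave names are taken from the log in this mail.
Step 3 (waiting for link up) is not modelled.

```python
from typing import List


def bond_setup_commands(bond: str, slaves: List[str]) -> List[List[str]]:
    """Build iproute2 commands that create an 802.3ad bond and enslave ports.

    Hypothetical helper: the original report does not say how the bond was
    configured, so this is only one plausible way to do steps 1 and 2.
    """
    cmds = [["ip", "link", "add", bond, "type", "bond", "mode", "802.3ad"]]
    for slave in slaves:
        # Take the port down before enslaving it.
        cmds.append(["ip", "link", "set", "dev", slave, "down"])
        cmds.append(["ip", "link", "set", "dev", slave, "master", bond])
    cmds.append(["ip", "link", "set", "dev", bond, "up"])
    return cmds


def dad_command(bond: str, address: str) -> List[str]:
    """Build an arping call in duplicate address detection mode (step 4).

    Assumption: iputils arping; the report may have used a different tool.
    """
    return ["arping", "-D", "-I", bond, "-c", "2", address]


if __name__ == "__main__":
    # Slave names as seen in the log; the address is a placeholder.
    for cmd in bond_setup_commands("bond0", ["eth6", "eth7", "eth8", "eth9"]):
        print(" ".join(cmd))
    print(" ".join(dad_command("bond0", "192.0.2.10")))
```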

Most of the time this works without problems, but in the error case
DAD fails with ENOBUFS from the send call on the packet socket, which
correlates with the tx dropped counter in the bond statistics.
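
One way to check that correlation is to sample the tx_dropped counter
from sysfs around the operation that fails. The sketch below assumes the
standard /sys/class/net/<iface>/statistics/tx_dropped file; the function
names and the "action" callback are made up for illustration and are not
what was used for this report.

```python
from pathlib import Path
from typing import Callable


def read_tx_dropped(iface: str, sysfs_root: str = "/sys/class/net") -> int:
    """Return the current tx_dropped counter of a network interface."""
    path = Path(sysfs_root) / iface / "statistics" / "tx_dropped"
    return int(path.read_text().strip())


def dropped_during(iface: str, action: Callable[[], None],
                   sysfs_root: str = "/sys/class/net") -> int:
    """Run action() and return how much tx_dropped grew while it ran.

    action is a hypothetical callback, e.g. something that triggers DAD.
    """
    before = read_tx_dropped(iface, sysfs_root)
    action()
    return read_tx_dropped(iface, sysfs_root) - before
```

A non-zero return value for bond0 around a failing DAD run would match
the ENOBUFS seen by the sender.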

I've enabled the debug prints in ad_agg_selection_logic(); in the
error case they look like this:

[ 4488.603417] bond0: (slave eth6): best Agg=1; P=1; a k=0; p k=1; Ind=1; Act=0
[ 4488.603428] bond0: (slave eth6): best ports 0000000019ca9537 slave 00000000ee0c58b9
[ 4488.603433] bond0: (slave eth6): Agg=1; P=1; a k=0; p k=1; Ind=1; Act=0
[ 4488.603437] bond0: (slave eth7): Agg=2; P=0; a k=0; p k=0; Ind=0; Act=0
[ 4488.603441] bond0: (slave eth8): Agg=3; P=0; a k=0; p k=0; Ind=0; Act=0
[ 4488.603444] bond0: (slave eth9): Agg=4; P=0; a k=0; p k=0; Ind=0; Act=0
[ 4488.603447] bond0: Warning: No 802.3ad response from the link partner for any adapters in the bond
[ 4488.603449] bond0: (slave eth6): LAG 1 chosen as the active LAG
[ 4488.603452] bond0: (slave eth6): Agg=1; P=1; a k=0; p k=1; Ind=1; Act=1
[ 4488.610481] 8021q: adding VLAN 0 to HW filter on device bond0
[ 4488.618756] bond0: (slave eth6): link status definitely up, 10000 Mbps full duplex
[ 4488.618795] bond0: (slave eth7): link status definitely up, 10000 Mbps full duplex
[ 4488.618831] bond0: (slave eth8): link status definitely up, 10000 Mbps full duplex
[ 4488.618836] bond0: active interface up!
[ 4488.678822] ixgbe 0000:81:00.1 eth9: detected SFP+: 6
[ 4488.706715] bond0: (slave eth6): best Agg=1; P=1; a k=15; p k=1; Ind=0; Act=0
[ 4488.706726] bond0: (slave eth6): best ports 0000000019ca9537 slave 00000000ee0c58b9
[ 4488.706732] bond0: (slave eth6): Agg=1; P=1; a k=15; p k=1; Ind=0; Act=0
[ 4488.706737] bond0: (slave eth7): Agg=2; P=1; a k=0; p k=1; Ind=1; Act=0
[ 4488.706740] bond0: (slave eth8): Agg=3; P=1; a k=0; p k=1; Ind=1; Act=0
[ 4488.706744] bond0: (slave eth9): Agg=4; P=1; a k=0; p k=1; Ind=1; Act=0
[ 4488.706747] bond0: (slave eth6): LAG 1 chosen as the active LAG
[ 4488.706750] bond0: (slave eth6): Agg=1; P=1; a k=15; p k=1; Ind=0; Act=1
[ 4488.814731] ixgbe 0000:81:00.1 eth9: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[ 4488.826760] bond0: (slave eth9): link status definitely up, 10000 Mbps full duplex
[ 4488.914672] bond0: (slave eth7): best Agg=2; P=1; a k=15; p k=1; Ind=0; Act=0
[ 4488.914682] bond0: (slave eth7): best ports 00000000413bcc63 slave 00000000931f59f6
[ 4488.914687] bond0: (slave eth6): Agg=1; P=1; a k=15; p k=1; Ind=0; Act=0
[ 4488.914692] bond0: (slave eth7): Agg=2; P=1; a k=15; p k=1; Ind=0; Act=0
[ 4488.914695] bond0: (slave eth8): Agg=3; P=1; a k=15; p k=1; Ind=0; Act=0
[ 4488.914698] bond0: (slave eth9): Agg=4; P=1; a k=0; p k=1; Ind=1; Act=0
[ 4488.914701] bond0: (slave eth7): LAG 2 chosen as the active LAG
[ 4488.914704] bond0: (slave eth7): Agg=2; P=1; a k=15; p k=1; Ind=0; Act=1
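
To spot the aggregator change in a longer capture, the "chosen as the
active LAG" lines can be pulled out mechanically. This is a minimal
sketch; the regular expression is written against the line format shown
above and the names Selection, active_lag_selections and lag_switches
are made up for this example.

```python
import re
from typing import List, NamedTuple, Tuple


class Selection(NamedTuple):
    timestamp: float  # kernel log timestamp in seconds
    slave: str        # slave named in the message
    lag: int          # aggregator id chosen as active


# Matches e.g. "[ 4488.603449] bond0: (slave eth6): LAG 1 chosen as the active LAG"
_CHOSEN = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+\S+: \(slave (\w+)\): LAG (\d+) chosen as the active LAG"
)


def active_lag_selections(log: str) -> List[Selection]:
    """Return every 'LAG N chosen as the active LAG' event in log order."""
    return [
        Selection(float(m.group(1)), m.group(2), int(m.group(3)))
        for m in _CHOSEN.finditer(log)
    ]


def lag_switches(selections: List[Selection]) -> List[Tuple[Selection, Selection]]:
    """Return consecutive selection pairs where the active LAG id changed."""
    return [
        (prev, cur)
        for prev, cur in zip(selections, selections[1:])
        if prev.lag != cur.lag
    ]
```

Run over the log above, this reports a single change, from LAG 1 to
LAG 2 at 4488.914701.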

I've added a debug statement to find out why Agg 2 is better than Agg 1 in
this case and it's because Agg 2 has a partner (__agg_has_partner() is true)
while Agg 1 doesn't.

Wouldn't it make sense to also check for slaves in COLLECTING|DISTRIBUTING
state before switching to a new aggregator?
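
To make the question concrete, here is a small model of that idea. It is
not the bonding driver code and not a patch: it only models the
"has a partner" preference described above plus one possible reading of
the extra check, namely that the candidate must already have a port with
both Collecting and Distributing set. The class and function names are
hypothetical; the bit values are the Collecting (0x10) and Distributing
(0x20) bits of the LACP actor state octet.

```python
from dataclasses import dataclass, field
from typing import List

# Collecting and Distributing bits of the LACP actor state octet.
LACP_STATE_COLLECTING = 0x10
LACP_STATE_DISTRIBUTING = 0x20
COLL_DIST = LACP_STATE_COLLECTING | LACP_STATE_DISTRIBUTING


@dataclass
class Port:
    name: str
    actor_oper_state: int = 0


@dataclass
class Aggregator:
    ident: int
    has_partner: bool
    ports: List[Port] = field(default_factory=list)


def has_forwarding_port(agg: Aggregator) -> bool:
    """True if at least one port is both collecting and distributing."""
    return any((p.actor_oper_state & COLL_DIST) == COLL_DIST for p in agg.ports)


def should_switch(active: Aggregator, candidate: Aggregator,
                  require_forwarding: bool) -> bool:
    """Simplified model: only the partner criterion plus the proposed guard."""
    if not (candidate.has_partner and not active.has_partner):
        return False
    if require_forwarding and not has_forwarding_port(candidate):
        # Proposed guard: the candidate cannot pass traffic yet, so stay put.
        return False
    return True
```

With require_forwarding=False the model switches as soon as the
candidate has a partner; with the guard enabled it waits until one of
the candidate's ports is actually collecting and distributing.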

Thomas.

-- 
SUSE Software Solutions Germany GmbH
HRB 36809 (AG Nürnberg)
Geschäftsführer: Ivo Totev, Andrew McDonald, Werner Knoblich