lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aIC5HrE9js_YtSCB@fedora>
Date: Wed, 23 Jul 2025 10:27:42 +0000
From: Hangbin Liu <liuhangbin@...il.com>
To: Jay Vosburgh <jv@...sburgh.net>
Cc: netdev@...r.kernel.org, Andrew Lunn <andrew+netdev@...n.ch>,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	Nikolay Aleksandrov <razor@...ckwall.org>,
	Simon Horman <horms@...nel.org>, Shuah Khan <shuah@...nel.org>,
	linux-kselftest@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net 1/2] bonding: update ntt to true in passive mode

On Tue, Jul 15, 2025 at 09:19:49PM -0700, Jay Vosburgh wrote:
> 	Presuming that I'm correct that we're not implementing 6.4.1 d),
> above, correctly, then I don't think this is a proper fix, as it kind of
> band-aids over the problem a bit.
> 
> 	Looking at the code, I suspect the problem revolves around the
> "lacp_active" check in ad_periodic_machine():
> 
> static void ad_periodic_machine(struct port *port, struct bond_params *bond_params)
> {
> 	periodic_states_t last_state;
> 
> 	/* keep current state machine state to compare later if it was changed */
> 	last_state = port->sm_periodic_state;
> 
> 	/* check if port was reinitialized */
> 	if (((port->sm_vars & AD_PORT_BEGIN) || !(port->sm_vars & AD_PORT_LACP_ENABLED) || !port->is_enabled) ||
> 	    (!(port->actor_oper_port_state & LACP_STATE_LACP_ACTIVITY) && !(port->partner_oper.port_state & LACP_STATE_LACP_ACTIVITY)) ||
> 	    !bond_params->lacp_active) {
> 		port->sm_periodic_state = AD_NO_PERIODIC;
> 	}
> 
> 	In the above, because all the tests are chained with ||, the
> lacp_active test overrides the two correct-looking
> LACP_STATE_LACP_ACTIVITY tests.
> 
> 	It looks like ad_initialize_port() always sets
> LACP_STATE_LACP_ACTIVITY in the port->actor_oper_port_state, and nothing
> ever clears it.
> 
> 	Thinking out loud, perhaps this could be fixed by
> 
> 	a) remove the test of bond_params->lacp_active here, and,
> 
> 	b) The lacp_active option setting controls whether LACP_ACTIVITY
> is set in port->actor_oper_port_state.
> 
> 	Thoughts?

Hi Jay,

I did some investigation and testing. In addition to your previous change,
we also need to initialize the partner's state to 0 in ad_initialize_port_tmpl().
Otherwise, the check:
```
!(port->partner_oper.port_state & LACP_STATE_LACP_ACTIVITY)
```
in ad_periodic_machine() will fail even when the actor is in passive mode.

Also, the line:
```
port->partner_oper.port_state |= LACP_STATE_LACP_ACTIVITY;
```
in ad_rx_machine() should be removed, since we can't assume the partner is in
active mode. [1]

With these two changes, we can ensure:
1. In passive mode, the actor will not send LACPDU before receiving any LACPDU from the partner.
2. Once it receives the partner’s LACPDU, it will start sending periodic LACPDUs as expected.

Do you agree with making these changes? If so, I can post a patch for your review.

[1] IEEE 8021AX-2020, Figure 6-14—LACP Receive state diagram, the AD_RX_EXPIRED
statue should be
```
Partner_Oper_Port_State.Synchronization = FALSE;
Partner_Oper_Port_State.Short_Timeout = TRUE;
Actor_Oper_Port_State.Expired = TRUE;
LACP_currentWhile = Short_Timeout_Time;
```

Thanks,
Hangbin

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ