[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aIC5HrE9js_YtSCB@fedora>
Date: Wed, 23 Jul 2025 10:27:42 +0000
From: Hangbin Liu <liuhangbin@...il.com>
To: Jay Vosburgh <jv@...sburgh.net>
Cc: netdev@...r.kernel.org, Andrew Lunn <andrew+netdev@...n.ch>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Nikolay Aleksandrov <razor@...ckwall.org>,
Simon Horman <horms@...nel.org>, Shuah Khan <shuah@...nel.org>,
linux-kselftest@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net 1/2] bonding: update ntt to true in passive mode
On Tue, Jul 15, 2025 at 09:19:49PM -0700, Jay Vosburgh wrote:
> Presuming that I'm correct that we're not implementing 6.4.1 d),
> above, correctly, then I don't think this is a proper fix, as it kind of
> band-aids over the problem a bit.
>
> Looking at the code, I suspect the problem revolves around the
> "lacp_active" check in ad_periodic_machine():
>
> static void ad_periodic_machine(struct port *port, struct bond_params *bond_params)
> {
> periodic_states_t last_state;
>
> /* keep current state machine state to compare later if it was changed */
> last_state = port->sm_periodic_state;
>
> /* check if port was reinitialized */
> if (((port->sm_vars & AD_PORT_BEGIN) || !(port->sm_vars & AD_PORT_LACP_ENABLED) || !port->is_enabled) ||
> (!(port->actor_oper_port_state & LACP_STATE_LACP_ACTIVITY) && !(port->partner_oper.port_state & LACP_STATE_LACP_ACTIVITY)) ||
> !bond_params->lacp_active) {
> port->sm_periodic_state = AD_NO_PERIODIC;
> }
>
> In the above, because all the tests are chained with ||, the
> lacp_active test overrides the two correct-looking
> LACP_STATE_LACP_ACTIVITY tests.
>
> It looks like ad_initialize_port() always sets
> LACP_STATE_LACP_ACTIVITY in the port->actor_oper_port_state, and nothing
> ever clears it.
>
> Thinking out loud, perhaps this could be fixed by
>
> a) remove the test of bond_params->lacp_active here, and,
>
> b) The lacp_active option setting controls whether LACP_ACTIVITY
> is set in port->actor_oper_port_state.
>
> Thoughts?
Hi Jay,
I did some investigation and testing. In addition to your previous change,
we also need to initialize the partner's state to 0 in ad_initialize_port_tmpl().
Otherwise, the check:
```
!(port->partner_oper.port_state & LACP_STATE_LACP_ACTIVITY)
```
in ad_periodic_machine() will fail even when the actor is in passive mode.
Also, the line:
```
port->partner_oper.port_state |= LACP_STATE_LACP_ACTIVITY;
```
in ad_rx_machine() should be removed, since we can't assume the partner is in
active mode. [1]
With these two changes, we can ensure:
1. In passive mode, the actor will not send LACPDU before receiving any LACPDU from the partner.
2. Once it receives the partner’s LACPDU, it will start sending periodic LACPDUs as expected.
Do you agree with making these changes? If so, I can post a patch for your review.
[1] IEEE 8021AX-2020, Figure 6-14—LACP Receive state diagram, the AD_RX_EXPIRED
statue should be
```
Partner_Oper_Port_State.Synchronization = FALSE;
Partner_Oper_Port_State.Short_Timeout = TRUE;
Actor_Oper_Port_State.Expired = TRUE;
LACP_currentWhile = Short_Timeout_Time;
```
Thanks,
Hangbin
Powered by blists - more mailing lists