lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-Id: <341249BC-4A2E-4C90-A960-BB07FAA9C092@bamaicloud.com>
Date: Fri, 20 Jun 2025 11:15:36 +0800
From: Tonghao Zhang <tonghao@...aicloud.com>
To: carlos.bilbao@...nel.org
Cc: jv@...sburgh.net,
 andrew+netdev@...n.ch,
 davem@...emloft.net,
 edumazet@...gle.com,
 kuba@...nel.org,
 pabeni@...hat.com,
 horms@...nel.org,
 netdev@...r.kernel.org,
 linux-kernel@...r.kernel.org,
 sforshee@...nel.org,
 bilbao@...edu
Subject: Re: [PATCH] bonding: Improve the accuracy of LACPDU transmissions



> 2025年6月19日 03:53,carlos.bilbao@...nel.org 写道:
> 
> From: Carlos Bilbao <carlos.bilbao@...nel.org>
> 
> Improve the timing accuracy of LACPDU transmissions in the bonding 802.3ad
> (LACP) driver. The current approach relies on a decrementing counter to
> limit the transmission rate. In our experience, this method is susceptible
> to delays (such as those caused by CPU contention or soft lockups) which
> can lead to accumulated drift in the LACPDU send interval. Over time, this
> drift can cause synchronization issues with the top-of-rack (ToR) switch
> managing the LAG, manifesting as lag map flapping. This in turn can trigger
> temporary interface removal and potential packet loss.
> 
> This patch improves stability with a jiffies-based mechanism to track and
> enforce the minimum transmission interval; keeping track of when the next
> LACPDU should be sent.
> 
> Suggested-by: Seth Forshee (DigitalOcean) <sforshee@...nel.org>
> Signed-off-by: Carlos Bilbao (DigitalOcean) <carlos.bilbao@...nel.org>
> ---
> drivers/net/bonding/bond_3ad.c | 18 ++++++++----------
> include/net/bond_3ad.h         |  5 +----
> 2 files changed, 9 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
> index c6807e473ab7..47610697e4e5 100644
> --- a/drivers/net/bonding/bond_3ad.c
> +++ b/drivers/net/bonding/bond_3ad.c
> @@ -1375,10 +1375,12 @@ static void ad_churn_machine(struct port *port)
>  */
> static void ad_tx_machine(struct port *port)
> {
> - /* check if tx timer expired, to verify that we do not send more than
> - * 3 packets per second
> - */
> - if (port->sm_tx_timer_counter && !(--port->sm_tx_timer_counter)) {
> + unsigned long now = jiffies;
> +
> + /* Check if enough time has passed since the last LACPDU sent */
> + if (time_after_eq(now, port->sm_tx_next_jiffies)) {
> + port->sm_tx_next_jiffies += ad_ticks_per_sec / AD_MAX_TX_IN_SECOND;
> +
> /* check if there is something to send */
> if (port->ntt && (port->sm_vars & AD_PORT_LACP_ENABLED)) {
> __update_lacpdu_from_port(port);
> @@ -1395,10 +1397,6 @@ static void ad_tx_machine(struct port *port)
> port->ntt = false;
> }
> }
> - /* restart tx timer(to verify that we will not exceed
> - * AD_MAX_TX_IN_SECOND
> - */
> - port->sm_tx_timer_counter = ad_ticks_per_sec/AD_MAX_TX_IN_SECOND;
> }
> }
> 
> @@ -2199,9 +2197,9 @@ void bond_3ad_bind_slave(struct slave *slave)
> /* actor system is the bond's system */
> __ad_actor_update_port(port);
> /* tx timer(to verify that no more than MAX_TX_IN_SECOND
> - * lacpdu's are sent in one second)
> + * lacpdu's are sent in the configured interval (1 or 30 secs))
> */
> - port->sm_tx_timer_counter = ad_ticks_per_sec/AD_MAX_TX_IN_SECOND;
> + port->sm_tx_next_jiffies = jiffies + ad_ticks_per_sec / AD_MAX_TX_IN_SECOND;

If CONFIG_HZ is 1000, there is 1000 tick per second, but "ad_ticks_per_sec / AD_MAX_TX_IN_SECOND” == 10/3 == 3, so that means send lacp packets every 3 ticks ?

> 
> __disable_port(port);
> 
> diff --git a/include/net/bond_3ad.h b/include/net/bond_3ad.h
> index 2053cd8e788a..956d4cb45db1 100644
> --- a/include/net/bond_3ad.h
> +++ b/include/net/bond_3ad.h
> @@ -231,10 +231,7 @@ typedef struct port {
> mux_states_t sm_mux_state; /* state machine mux state */
> u16 sm_mux_timer_counter; /* state machine mux timer counter */
> tx_states_t sm_tx_state; /* state machine tx state */
> - u16 sm_tx_timer_counter; /* state machine tx timer counter
> - * (always on - enter to transmit
> - *  state 3 time per second)
> - */
> + unsigned long sm_tx_next_jiffies;/* expected jiffies for next LACPDU sent */
> u16 sm_churn_actor_timer_counter;
> u16 sm_churn_partner_timer_counter;
> u32 churn_actor_count;
> -- 
> 2.43.0
> 
> 
> 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ