Message-ID: <c1cf6883-a323-40e8-881d-ae7023bbc61a@gmail.com>
Date: Wed, 16 Jul 2025 14:44:04 -0500
From: Carlos Bilbao <carlos.bilbao.osdev@...il.com>
To: Jay Vosburgh <jv@...sburgh.net>, Carlos Bilbao <bilbao@...edu>
Cc: carlos.bilbao@...nel.org, andrew+netdev@...n.ch, davem@...emloft.net,
edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com, horms@...nel.org,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org, sforshee@...nel.org
Subject: Re: [PATCH] bonding: Switch periodic LACPDU state machine from
counter to jiffies
Hello Jay,
On 7/16/25 10:30, Jay Vosburgh wrote:
> Carlos Bilbao <bilbao@...edu> wrote:
>
>> FYI, I was able to test this locally but couldn’t find any kselftests to
>> stress the bonding state machine. If anyone knows of additional ways to
>> test it, I’d be happy to run them.
> Your commit message says this change will "help reduce drift
> under contention," but above you say you're unable to stress the state
> machine.
>
> How do you induce "drift under contention" to test that your
> patch actually improves something? What testing has been done to ensure
> that the new code doesn't change the behavior in other ways (regressions)?
I tested the bonding driver with and without CPU contention*. With this
patch, LACPDU transmission is much more consistent under load, with a
standard deviation of 0.0065 seconds in the interval between packets. In
comparison, the current version had a standard deviation of 0.15 seconds
(roughly 23x more variability). I imagine this gets worse with greater
contention.
When I mentioned a possible kselftest (or similar) to "stress" the state
machine, I meant whether there is already any testing that exercises the
state machine through different transitions -- e.g., scenarios where the
switch instructs the bond to change configs (for example, between fast and
slow LACP modes), where the bond is reset under certain conditions, etc. I
just want to be exhaustive because, as you mentioned, the state machine has
been around for a long time.
*System was stressed using:
stress-ng --cpu $(nproc) --timeout 60
Metrics were collected with:
sudo tcpdump -e -ni <my interface> ether proto 0x8809 and ether src <mac>
>
> Without a specific reproducible bug scenario that this change
> fixes, I'm leery of applying such a refactor to code that has seemingly
> been working fine for 20+ years.
>
> I gather that what this is intending to do is reduce the current
> dependency on the scheduling accuracy of the workqueue event that runs
> the state machines. The current implementation works on a "number of
> invocations" basis, assuming that the event is invoked every 100 msec,
> and computes various timeouts based on the number of times the state
> machine runs.
Yep.
>
> -J
>
>> Thanks!
>>
>> Carlos
>>
>> On 7/15/25 15:57, carlos.bilbao@...nel.org wrote:
>>> From: Carlos Bilbao <carlos.bilbao@...nel.org>
>>>
>>> Replace the bonding periodic state machine for LACPDU transmission of
>>> function ad_periodic_machine() with a jiffies-based mechanism, which is
>>> more accurate and can help reduce drift under contention.
>>>
>>> Signed-off-by: Carlos Bilbao (DigitalOcean) <carlos.bilbao@...nel.org>
>>> ---
>>> drivers/net/bonding/bond_3ad.c | 79 +++++++++++++---------------------
>>> include/net/bond_3ad.h | 2 +-
>>> 2 files changed, 32 insertions(+), 49 deletions(-)
>>>
>>> diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
>>> index c6807e473ab7..8654a51266a3 100644
>>> --- a/drivers/net/bonding/bond_3ad.c
>>> +++ b/drivers/net/bonding/bond_3ad.c
>>> @@ -1421,44 +1421,24 @@ static void ad_periodic_machine(struct port *port, struct bond_params *bond_para
>>> (!(port->actor_oper_port_state & LACP_STATE_LACP_ACTIVITY) && !(port->partner_oper.port_state & LACP_STATE_LACP_ACTIVITY)) ||
>>> !bond_params->lacp_active) {
>>> port->sm_periodic_state = AD_NO_PERIODIC;
>>> - }
>>> - /* check if state machine should change state */
>>> - else if (port->sm_periodic_timer_counter) {
>>> - /* check if periodic state machine expired */
>>> - if (!(--port->sm_periodic_timer_counter)) {
>>> - /* if expired then do tx */
>>> - port->sm_periodic_state = AD_PERIODIC_TX;
>>> - } else {
>>> - /* If not expired, check if there is some new timeout
>>> - * parameter from the partner state
>>> - */
>>> - switch (port->sm_periodic_state) {
>>> - case AD_FAST_PERIODIC:
>>> - if (!(port->partner_oper.port_state
>>> - & LACP_STATE_LACP_TIMEOUT))
>>> - port->sm_periodic_state = AD_SLOW_PERIODIC;
>>> - break;
>>> - case AD_SLOW_PERIODIC:
>>> - if ((port->partner_oper.port_state & LACP_STATE_LACP_TIMEOUT)) {
>>> - port->sm_periodic_timer_counter = 0;
>>> - port->sm_periodic_state = AD_PERIODIC_TX;
>>> - }
>>> - break;
>>> - default:
>>> - break;
>>> - }
>>> - }
>>> + } else if (port->sm_periodic_state == AD_NO_PERIODIC)
>>> + port->sm_periodic_state = AD_FAST_PERIODIC;
>>> + /* check if periodic state machine expired */
>>> + else if (time_after_eq(jiffies, port->sm_periodic_next_jiffies)) {
>>> + /* if expired then do tx */
>>> + port->sm_periodic_state = AD_PERIODIC_TX;
>>> } else {
>>> + /* If not expired, check if there is some new timeout
>>> + * parameter from the partner state
>>> + */
>>> switch (port->sm_periodic_state) {
>>> - case AD_NO_PERIODIC:
>>> - port->sm_periodic_state = AD_FAST_PERIODIC;
>>> - break;
>>> - case AD_PERIODIC_TX:
>>> - if (!(port->partner_oper.port_state &
>>> - LACP_STATE_LACP_TIMEOUT))
>>> + case AD_FAST_PERIODIC:
>>> + if (!(port->partner_oper.port_state & LACP_STATE_LACP_TIMEOUT))
>>> port->sm_periodic_state = AD_SLOW_PERIODIC;
>>> - else
>>> - port->sm_periodic_state = AD_FAST_PERIODIC;
>>> + break;
>>> + case AD_SLOW_PERIODIC:
>>> + if ((port->partner_oper.port_state & LACP_STATE_LACP_TIMEOUT))
>>> + port->sm_periodic_state = AD_PERIODIC_TX;
>>> break;
>>> default:
>>> break;
>>> @@ -1471,21 +1451,24 @@ static void ad_periodic_machine(struct port *port, struct bond_params *bond_para
>>> "Periodic Machine: Port=%d, Last State=%d, Curr State=%d\n",
>>> port->actor_port_number, last_state,
>>> port->sm_periodic_state);
>>> +
>>> switch (port->sm_periodic_state) {
>>> - case AD_NO_PERIODIC:
>>> - port->sm_periodic_timer_counter = 0;
>>> - break;
>>> - case AD_FAST_PERIODIC:
>>> - /* decrement 1 tick we lost in the PERIODIC_TX cycle */
>>> - port->sm_periodic_timer_counter = __ad_timer_to_ticks(AD_PERIODIC_TIMER, (u16)(AD_FAST_PERIODIC_TIME))-1;
>>> - break;
>>> - case AD_SLOW_PERIODIC:
>>> - /* decrement 1 tick we lost in the PERIODIC_TX cycle */
>>> - port->sm_periodic_timer_counter = __ad_timer_to_ticks(AD_PERIODIC_TIMER, (u16)(AD_SLOW_PERIODIC_TIME))-1;
>>> - break;
>>> case AD_PERIODIC_TX:
>>> port->ntt = true;
>>> - break;
>>> + if (!(port->partner_oper.port_state &
>>> + LACP_STATE_LACP_TIMEOUT))
>>> + port->sm_periodic_state = AD_SLOW_PERIODIC;
>>> + else
>>> + port->sm_periodic_state = AD_FAST_PERIODIC;
>>> + fallthrough;
>>> + case AD_SLOW_PERIODIC:
>>> + case AD_FAST_PERIODIC:
>>> + if (port->sm_periodic_state == AD_SLOW_PERIODIC)
>>> + port->sm_periodic_next_jiffies = jiffies
>>> + + HZ * AD_SLOW_PERIODIC_TIME;
>>> + else /* AD_FAST_PERIODIC */
>>> + port->sm_periodic_next_jiffies = jiffies
>>> + + HZ * AD_FAST_PERIODIC_TIME;
>>> default:
>>> break;
>>> }
>>> @@ -1987,7 +1970,7 @@ static void ad_initialize_port(struct port *port, int lacp_fast)
>>> port->sm_rx_state = 0;
>>> port->sm_rx_timer_counter = 0;
>>> port->sm_periodic_state = 0;
>>> - port->sm_periodic_timer_counter = 0;
>>> + port->sm_periodic_next_jiffies = 0;
>>> port->sm_mux_state = 0;
>>> port->sm_mux_timer_counter = 0;
>>> port->sm_tx_state = 0;
>>> diff --git a/include/net/bond_3ad.h b/include/net/bond_3ad.h
>>> index 2053cd8e788a..aabb8c97caf4 100644
>>> --- a/include/net/bond_3ad.h
>>> +++ b/include/net/bond_3ad.h
>>> @@ -227,7 +227,7 @@ typedef struct port {
>>> rx_states_t sm_rx_state; /* state machine rx state */
>>> u16 sm_rx_timer_counter; /* state machine rx timer counter */
>>> periodic_states_t sm_periodic_state; /* state machine periodic state */
>>> - u16 sm_periodic_timer_counter; /* state machine periodic timer counter */
>>> + unsigned long sm_periodic_next_jiffies; /* state machine periodic next expected sent */
>>> mux_states_t sm_mux_state; /* state machine mux state */
>>> u16 sm_mux_timer_counter; /* state machine mux timer counter */
>>> tx_states_t sm_tx_state; /* state machine tx state */
> ---
> -Jay Vosburgh, jv@...sburgh.net
>
Thanks,
Carlos