[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3d55690a-8932-4560-4267-ab28816fdb47@redhat.com>
Date: Tue, 16 Aug 2022 09:41:25 -0400
From: Jonathan Toppins <jtoppins@...hat.com>
To: Jay Vosburgh <jay.vosburgh@...onical.com>
Cc: netdev@...r.kernel.org, liuhangbin@...il.com,
Veaceslav Falico <vfalico@...il.com>,
Andy Gospodarek <andy@...yhouse.net>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net v3 2/2] bonding: 802.3ad: fix no transmission of
LACPDUs
On 8/16/22 09:11, Jay Vosburgh wrote:
> Jonathan Toppins <jtoppins@...hat.com> wrote:
>
>> This is caused by the global variable ad_ticks_per_sec being zero as
>> demonstrated by the reproducer script discussed below. This causes
>> all timer values in __ad_timer_to_ticks to be zero, resulting
>> in the periodic timer to never fire.
>>
>> To reproduce:
>> Run the script in
>> `tools/testing/selftests/drivers/net/bonding/bond-break-lacpdu-tx.sh` which
>> puts bonding into a state where it never transmits LACPDUs.
>>
>> line 44: ip link add fbond type bond mode 4 miimon 200 \
>> xmit_hash_policy 1 ad_actor_sys_prio 65535 lacp_rate fast
>> setting bond param: ad_actor_sys_prio
>> given:
>> params.ad_actor_system = 0
>> call stack:
>> bond_option_ad_actor_sys_prio()
>> -> bond_3ad_update_ad_actor_settings()
>> -> set ad.system.sys_priority = bond->params.ad_actor_sys_prio
>> -> ad.system.sys_mac_addr = bond->dev->dev_addr; because
>> params.ad_actor_system == 0
>> results:
>> ad.system.sys_mac_addr = bond->dev->dev_addr
>>
>> line 48: ip link set fbond address 52:54:00:3B:7C:A6
>> setting bond MAC addr
>> call stack:
>> bond->dev->dev_addr = new_mac
>>
>> line 52: ip link set fbond type bond ad_actor_sys_prio 65535
>> setting bond param: ad_actor_sys_prio
>> given:
>> params.ad_actor_system = 0
>> call stack:
>> bond_option_ad_actor_sys_prio()
>> -> bond_3ad_update_ad_actor_settings()
>> -> set ad.system.sys_priority = bond->params.ad_actor_sys_prio
>> -> ad.system.sys_mac_addr = bond->dev->dev_addr; because
>> params.ad_actor_system == 0
>> results:
>> ad.system.sys_mac_addr = bond->dev->dev_addr
>>
>> line 60: ip link set veth1-bond down master fbond
>> given:
>> params.ad_actor_system = 0
>> params.mode = BOND_MODE_8023AD
>> ad.system.sys_mac_addr == bond->dev->dev_addr
>> call stack:
>> bond_enslave
>> -> bond_3ad_initialize(); because first slave
>> -> if ad.system.sys_mac_addr != bond->dev->dev_addr
>> return
>> results:
>> Nothing is run in bond_3ad_initialize() because dev_add equals
>> sys_mac_addr leaving the global ad_ticks_per_sec zero as it is
>> never initialized anywhere else.
>>
>> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
>> Signed-off-by: Jonathan Toppins <jtoppins@...hat.com>
>> ---
>>
>> Notes:
>> v2:
>> * split this fix from the reproducer
>> v3:
>> * rebased to latest net/master
>>
>> drivers/net/bonding/bond_3ad.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
>> index d7fb33c078e8..957d30db6f95 100644
>> --- a/drivers/net/bonding/bond_3ad.c
>> +++ b/drivers/net/bonding/bond_3ad.c
>> @@ -84,7 +84,8 @@ enum ad_link_speed_type {
>> static const u8 null_mac_addr[ETH_ALEN + 2] __long_aligned = {
>> 0, 0, 0, 0, 0, 0
>> };
>> -static u16 ad_ticks_per_sec;
>> +
>> +static u16 ad_ticks_per_sec = 1000 / AD_TIMER_INTERVAL;
>> static const int ad_delta_in_ticks = (AD_TIMER_INTERVAL * HZ) / 1000;
>
> I still feel like this is kind of a hack, as it's not really
> fixing bond_3ad_initialize to actually work (which is the real problem
> as I understand it). If it's ok to skip all that for this case, then
> why do we ever need to call bond_3ad_initialize?
>
The way it is currently written you still need to call
bond_3ad_initialize() just not for setting the tick resolution. The
issue here is ad_ticks_per_sec is used in several places to calculate
timer periods, __ad_timer_to_ticks(), for various timers in the 802.3ad
protocol. And if this variable, ad_ticks_per_sec, is left uninitialized
all of these timer periods go to zero. Since the value passed in
bond_3ad_initialize() is an immediate value I simply moved it off of the
call stack and set the static global variable instead.
To fix bond_3ad_initialize(), probably something like the below is
needed, but I do not understand why the guard if check was placed in
bond_3ad_initialize().
diff --git i/drivers/net/bonding/bond_3ad.c w/drivers/net/bonding/bond_3ad.c
index d7fb33c078e8..5b5146f5c4ea 100644
--- i/drivers/net/bonding/bond_3ad.c
+++ w/drivers/net/bonding/bond_3ad.c
@@ -2005,32 +2005,21 @@ void bond_3ad_initiate_agg_selection(struct
bonding *bond, int timeout)
*
* Can be called only after the mac address of the bond is set.
*/
-void bond_3ad_initialize(struct bonding *bond, u16 tick_resolution)
+void bond_3ad_initialize(struct bonding *bond)
{
- /* check that the bond is not initialized yet */
- if (!MAC_ADDRESS_EQUAL(&(BOND_AD_INFO(bond).system.sys_mac_addr),
- bond->dev->dev_addr)) {
-
- BOND_AD_INFO(bond).aggregator_identifier = 0;
-
- BOND_AD_INFO(bond).system.sys_priority =
- bond->params.ad_actor_sys_prio;
- if (is_zero_ether_addr(bond->params.ad_actor_system))
- BOND_AD_INFO(bond).system.sys_mac_addr =
- *((struct mac_addr *)bond->dev->dev_addr);
- else
- BOND_AD_INFO(bond).system.sys_mac_addr =
- *((struct mac_addr *)bond->params.ad_actor_system);
-
- /* initialize how many times this module is called in one
- * second (should be about every 100ms)
- */
- ad_ticks_per_sec = tick_resolution;
+ BOND_AD_INFO(bond).aggregator_identifier = 0;
+ BOND_AD_INFO(bond).system.sys_priority =
+ bond->params.ad_actor_sys_prio;
+ if (is_zero_ether_addr(bond->params.ad_actor_system))
+ BOND_AD_INFO(bond).system.sys_mac_addr =
+ *((struct mac_addr *)bond->dev->dev_addr);
+ else
+ BOND_AD_INFO(bond).system.sys_mac_addr =
+ *((struct mac_addr *)bond->params.ad_actor_system);
- bond_3ad_initiate_agg_selection(bond,
- AD_AGGREGATOR_SELECTION_TIMER *
- ad_ticks_per_sec);
- }
+ bond_3ad_initiate_agg_selection(bond,
+ AD_AGGREGATOR_SELECTION_TIMER *
+ ad_ticks_per_sec);
}
/**
diff --git i/drivers/net/bonding/bond_main.c
w/drivers/net/bonding/bond_main.c
index 50e60843020c..5f56af9dc3ba 100644
--- i/drivers/net/bonding/bond_main.c
+++ w/drivers/net/bonding/bond_main.c
@@ -2078,10 +2078,10 @@ int bond_enslave(struct net_device *bond_dev,
struct net_device *slave_dev,
/* if this is the first slave */
if (!prev_slave) {
SLAVE_AD_INFO(new_slave)->id = 1;
- /* Initialize AD with the number of times that the AD timer is
called in 1 second
- * can be called only after the mac address of the bond is set
+ /* can be called only after the mac address of the
+ * bond is set
*/
- bond_3ad_initialize(bond, 1000/AD_TIMER_INTERVAL);
+ bond_3ad_initialize(bond);
} else {
SLAVE_AD_INFO(new_slave)->id =
SLAVE_AD_INFO(prev_slave)->id + 1;
diff --git i/include/net/bond_3ad.h w/include/net/bond_3ad.h
index 184105d68294..be2992e6de5d 100644
--- i/include/net/bond_3ad.h
+++ w/include/net/bond_3ad.h
@@ -290,7 +290,7 @@ static inline const char
*bond_3ad_churn_desc(churn_state_t state)
}
/* ========== AD Exported functions to the main bonding code ========== */
-void bond_3ad_initialize(struct bonding *bond, u16 tick_resolution);
+void bond_3ad_initialize(struct bonding *bond);
void bond_3ad_bind_slave(struct slave *slave);
void bond_3ad_unbind_slave(struct slave *slave);
void bond_3ad_state_machine_handler(struct work_struct *);
-Jon
Powered by blists - more mailing lists