lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Fri, 12 Sep 2014 00:44:58 +0200 From: Nikolay Aleksandrov <nikolay@...hat.com> To: Mahesh Bandewar <maheshb@...gle.com> CC: Jay Vosburgh <j.vosburgh@...il.com>, Veaceslav Falico <vfalico@...hat.com>, Andy Gospodarek <andy@...yhouse.net>, David Miller <davem@...emloft.net>, netdev <netdev@...r.kernel.org>, Eric Dumazet <edumazet@...gle.com>, Maciej Zenczykowski <maze@...gle.com> Subject: Re: [PATCH net-next v3 2/2] bonding: Simplify the xmit function for modes that use xmit_hash On 09/12/2014 12:08 AM, Mahesh Bandewar wrote: > some how my earlier mail bounced back (formatting issues, I suppose!). > So it's a resend. > > On Thu, Sep 11, 2014 at 2:27 PM, Mahesh Bandewar <maheshb@...gle.com> wrote: >> >> On Thu, Sep 11, 2014 at 2:39 AM, Nikolay Aleksandrov <nikolay@...hat.com> wrote: >>> On 11/09/14 06:16, Mahesh Bandewar wrote: >>>> >>>> Earlier change to use usable slave array for TLB mode had an additional >>>> performance advantage. So extending the same logic to all other modes >>>> that use xmit-hash for slave selection (viz 802.3AD, and XOR modes). >>>> Also consolidating this with the earlier TLB change. >>>> >>>> The main idea is to build the usable slaves array in the control path >>>> and use that array for slave selection during xmit operation. >>>> >>>> Measured performance in a setup with a bond of 4x1G NICs with 200 >>>> instances of netperf for the modes involved (3ad, xor, tlb) >>>> cmd: netperf -t TCP_RR -H <TargetHost> -l 60 -s 5 >>>> >>>> Mode TPS-Before TPS-After >>>> >>>> 802.3ad : 468,694 493,101 >>>> TLB (lb=0): 392,583 392,965 >>>> XOR : 475,696 484,517 >>>> >>>> Signed-off-by: Mahesh Bandewar <maheshb@...gle.com> >>>> --- >>>> v1: >>>> (a) If bond_update_slave_arr() fails to allocate memory, it will >>>> overwrite >>>> the slave that need to be removed. >>>> (b) Freeing of array will assign NULL (to handle bond->down to bond->up >>>> transition gracefully. >>>> (c) Change from pr_debug() to pr_err() if bond_update_slave_arr() >>>> returns >>>> failure. >>>> (d) XOR: bond_update_slave_arr() will consider mii-mon, arp-mon cases >>>> and >>>> will populate the array even if these parameters are not used. >>>> (e) 3AD: Should handle the ad_agg_selection_logic correctly. >>>> v2: >>>> (a) Removed rcu_read_{un}lock() calls from array manipulation code. >>>> (b) Slave link-events now refresh array for all these modes. >>>> (c) Moved free-array call from bond_close() to bond_uninit(). >>>> v3: >>>> (a) Fixed null pointer dereference. >>>> (b) Removed bond->lock lockdep dependency. >>>> >>> Hello Mahesh, >>> You should've given me time to respond, the reason I wrote this: >>> "First a question, if a bond device in XOR mode is up and we enslave a >>> single >>> slave how would it start transmitting ? Same question, if we are enslaving >>> a >>> second device then the array will be rebuild with only the first upon >>> NETDEV_UP >>> (of course all this is in the case miimon is 0). >>> The NETDEV_UP upon enslave happens before the slave is linked in." >>> was not because I wanted you to remove the slave rebuilding from the >>> NETDEV_UP/DOWN events, but because I didn't see how would a slave start >>> transmitting in XOR mode after enslaving, and I just tested it - it doesn't >>> since the slave array never gets rebuilt. The NETDEV_UP event is carried by >>> the dev_open() done in bond_enslave() earlier so the bond_set_carrier() in >>> the end isn't of much importance in most cases, simply do the following and >>> you'll see: >>> modprobe bonding mode=2 >>> ip set bond0 up >>> ifenslave bond0 eth0 >>> >>> Try to transmit anything and watch on the other side, you won't be able to >>> see anything as there's no slave array. My second question was given all >>> this, if you enslave any subsequent slaves, will it start transmitting ? But >>> I just tested this scenario and it still doesn't as the array doesn't get >>> built. >> > OK I tried that and I don't see what you have observed - > > Here is the log - > > ~# modprobe bonding mode=2 > ~# ip link set bond0 up > [ 133.537516] New slave count=0 > ~# [add ip addr] > ~# [add default route] > ~# ifenslave bond0 eth0 > ~# [ 211.044852] Adding slave=eth0 > [ 211.047826] New slave count=1 > ~# ifenslave bond0 eth1 > [ 723.795877] Adding slave=eth0 > [ 723.798853] Adding slave=eth1 > [ 723.801824] New slave count=2 > > I have added some instrumentation to see what is happening and > following are the couple of changes (basically printfs!) - > > > @@ -3730,13 +3736,16 @@ int bond_update_slave_arr(struct bonding > *bond, struct slave *skipslave) > continue; > if (skipslave == slave) > continue; > +pr_err("Adding slave=%s\n", slave->dev->name); > new_arr->arr[new_arr->count++] = slave; > } > > old_arr = rcu_dereference_protected(bond->slave_arr, > lockdep_rtnl_is_held() || > > lockdep_is_held(&bond->curr_slave_lock)); > rcu_assign_pointer(bond->slave_arr, new_arr); > +pr_err("New slave count=%d\n", new_arr->count); > if (old_arr) > kfree_rcu(old_arr, rcu); > > > Something is wrong here either you have some modified version or something else is going on because with your patch + the pr_err()s in bond_update_slave_arr() I get the following: # modprobe bonding mode=2 # ip l set bond0 up [ 47.905891] Slave count: 0 # ip addr add 192.168.160.2/24 dev bond0 # ifenslave bond0 eth1 [ 78.056764] bond0: Enslaving eth1 as an active interface with an up link [ 78.056789] IPv6: ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready *(no slave array messages)* # ping ... nothing. Doing a down/up cycle on the bond works as it should: # ip l set bond0 down # ip l set bond0 up [ 686.786898] Adding slave: eth1 [ 686.787153] Adding slave: eth2 [ 686.787329] Slave count: 2 And now it begins to work. Same happens if I enslave a subsequent slave, I really don't see how your arrays get rebuilt, do you have a different patch which does array rebuilding in bond_enslave() ? The NETDEV_UP for the slave is generated by dev_open() and it's too early to be caught as a slave by the bond's notifier. I'd be curious to see the stack trace on your slave rebuilding after the first enslave, could you insert a WARN_ON(1) in the bond_update_slave_arr() for example and see what's triggering the slave rebuild ? In contrast the same module without your patches: # modprobe bonding mode=2 # ip l set bond0 up # ip addr add 192.168.160.2/24 dev bond0 # ifenslave bond0 eth1 # ping ... works. I'm doing these tests in a VM with virtio_net devices. Also my default miimon is 0, if that matters. My tree looks like: 9afec9969e3b15a8d52acb07df48423c2f941e50 bonding: Simplify the xmit function for modes that use xmit_hash d876c8ca5d46c3b4c201289f48011350efa545ce bonding: display xmit_hash_policy for non-dynamic-tlb mode 9b5a8a12737ee9ff100e87ab7fdfdec4d0999f4e net: bpf: only build bpf_jit_binary_{alloc, free}() when jit selected >>> A few suggestions below, nothing serious though. >>> >>> >>>> drivers/net/bonding/bond_3ad.c | 76 +++----------------- >>>> drivers/net/bonding/bond_alb.c | 51 ++------------ >>>> drivers/net/bonding/bond_alb.h | 8 --- >>>> drivers/net/bonding/bond_main.c | 152 >>>> ++++++++++++++++++++++++++++++++++++---- >>>> drivers/net/bonding/bonding.h | 8 +++ >>>> 5 files changed, 164 insertions(+), 131 deletions(-) >>>> >>>> diff --git a/drivers/net/bonding/bond_3ad.c >>>> b/drivers/net/bonding/bond_3ad.c >>>> index 5d27a6207384..516075f0a740 100644 >>>> --- a/drivers/net/bonding/bond_3ad.c >>>> +++ b/drivers/net/bonding/bond_3ad.c >>>> @@ -1579,6 +1579,8 @@ static void ad_agg_selection_logic(struct aggregator >>>> *agg) >>>> __disable_port(port); >>>> } >>>> } >>>> + if (bond_update_slave_arr(bond, NULL)) >>>> + pr_err("Failed to build slave-array for 3ad >>>> mode.\n"); >>>> } >>>> >>>> /* if the selected aggregator is of join individuals >>>> @@ -1717,6 +1719,8 @@ static void ad_enable_collecting_distributing(struct >>>> port *port) >>>> port->actor_port_number, >>>> port->aggregator->aggregator_identifier); >>>> __enable_port(port); >>>> + if (bond_update_slave_arr(port->slave->bond, NULL)) >>>> + pr_err("Failed to build slave-array for 3ad >>>> mode.\n"); >>>> } >>>> } >>>> >>>> @@ -1733,6 +1737,8 @@ static void >>>> ad_disable_collecting_distributing(struct port *port) >>>> port->actor_port_number, >>>> port->aggregator->aggregator_identifier); >>>> __disable_port(port); >>>> + if (bond_update_slave_arr(port->slave->bond, NULL)) >>>> + pr_err("Failed to build slave-array for 3ad >>>> mode.\n"); >>>> } >>>> } >>>> >>>> @@ -2311,6 +2317,9 @@ void bond_3ad_handle_link_change(struct slave >>>> *slave, char link) >>>> */ >>>> port->sm_vars |= AD_PORT_BEGIN; >>>> >>>> + if (bond_update_slave_arr(slave->bond, NULL)) >>>> + pr_err("Failed to build slave-array for 3ad mode.\n"); >>>> + >>>> __release_state_machine_lock(port); >>>> } >>>> >>>> @@ -2406,73 +2415,6 @@ int bond_3ad_get_active_agg_info(struct bonding >>>> *bond, struct ad_info *ad_info) >>>> return ret; >>>> } >>>> >>>> -int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev) >>>> -{ >>>> - struct bonding *bond = netdev_priv(dev); >>>> - struct slave *slave, *first_ok_slave; >>>> - struct aggregator *agg; >>>> - struct ad_info ad_info; >>>> - struct list_head *iter; >>>> - int slaves_in_agg; >>>> - int slave_agg_no; >>>> - int agg_id; >>>> - >>>> - if (__bond_3ad_get_active_agg_info(bond, &ad_info)) { >>>> - netdev_dbg(dev, "__bond_3ad_get_active_agg_info >>>> failed\n"); >>>> - goto err_free; >>>> - } >>>> - >>>> - slaves_in_agg = ad_info.ports; >>>> - agg_id = ad_info.aggregator_id; >>>> - >>>> - if (slaves_in_agg == 0) { >>>> - netdev_dbg(dev, "active aggregator is empty\n"); >>>> - goto err_free; >>>> - } >>>> - >>>> - slave_agg_no = bond_xmit_hash(bond, skb) % slaves_in_agg; >>>> - first_ok_slave = NULL; >>>> - >>>> - bond_for_each_slave_rcu(bond, slave, iter) { >>>> - agg = SLAVE_AD_INFO(slave)->port.aggregator; >>>> - if (!agg || agg->aggregator_identifier != agg_id) >>>> - continue; >>>> - >>>> - if (slave_agg_no >= 0) { >>>> - if (!first_ok_slave && bond_slave_can_tx(slave)) >>>> - first_ok_slave = slave; >>>> - slave_agg_no--; >>>> - continue; >>>> - } >>>> - >>>> - if (bond_slave_can_tx(slave)) { >>>> - bond_dev_queue_xmit(bond, skb, slave->dev); >>>> - goto out; >>>> - } >>>> - } >>>> - >>>> - if (slave_agg_no >= 0) { >>>> - netdev_err(dev, "Couldn't find a slave to tx on for >>>> aggregator ID %d\n", >>>> - agg_id); >>>> - goto err_free; >>>> - } >>>> - >>>> - /* we couldn't find any suitable slave after the agg_no, so use >>>> the >>>> - * first suitable found, if found. >>>> - */ >>>> - if (first_ok_slave) >>>> - bond_dev_queue_xmit(bond, skb, first_ok_slave->dev); >>>> - else >>>> - goto err_free; >>>> - >>>> -out: >>>> - return NETDEV_TX_OK; >>>> -err_free: >>>> - /* no suitable interface, frame not sent */ >>>> - dev_kfree_skb_any(skb); >>>> - goto out; >>>> -} >>>> - >>>> int bond_3ad_lacpdu_recv(const struct sk_buff *skb, struct bonding >>>> *bond, >>>> struct slave *slave) >>>> { >>>> diff --git a/drivers/net/bonding/bond_alb.c >>>> b/drivers/net/bonding/bond_alb.c >>>> index 028496205f39..dbac0ceb17f6 100644 >>>> --- a/drivers/net/bonding/bond_alb.c >>>> +++ b/drivers/net/bonding/bond_alb.c >>>> @@ -200,7 +200,6 @@ static int tlb_initialize(struct bonding *bond) >>>> static void tlb_deinitialize(struct bonding *bond) >>>> { >>>> struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond)); >>>> - struct tlb_up_slave *arr; >>>> >>>> _lock_tx_hashtbl_bh(bond); >>>> >>>> @@ -208,10 +207,6 @@ static void tlb_deinitialize(struct bonding *bond) >>>> bond_info->tx_hashtbl = NULL; >>>> >>>> _unlock_tx_hashtbl_bh(bond); >>>> - >>>> - arr = rtnl_dereference(bond_info->slave_arr); >>>> - if (arr) >>>> - kfree_rcu(arr, rcu); >>>> } >>>> >>>> static long long compute_gap(struct slave *slave) >>>> @@ -1409,39 +1404,9 @@ out: >>>> return NETDEV_TX_OK; >>>> } >>>> >>>> -static int bond_tlb_update_slave_arr(struct bonding *bond, >>>> - struct slave *skipslave) >>>> -{ >>>> - struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond)); >>>> - struct slave *tx_slave; >>>> - struct list_head *iter; >>>> - struct tlb_up_slave *new_arr, *old_arr; >>>> - >>>> - new_arr = kzalloc(offsetof(struct tlb_up_slave, >>>> arr[bond->slave_cnt]), >>>> - GFP_ATOMIC); >>>> - if (!new_arr) >>>> - return -ENOMEM; >>>> - >>>> - bond_for_each_slave(bond, tx_slave, iter) { >>>> - if (!bond_slave_can_tx(tx_slave)) >>>> - continue; >>>> - if (skipslave == tx_slave) >>>> - continue; >>>> - new_arr->arr[new_arr->count++] = tx_slave; >>>> - } >>>> - >>>> - old_arr = rtnl_dereference(bond_info->slave_arr); >>>> - rcu_assign_pointer(bond_info->slave_arr, new_arr); >>>> - if (old_arr) >>>> - kfree_rcu(old_arr, rcu); >>>> - >>>> - return 0; >>>> -} >>>> - >>>> int bond_tlb_xmit(struct sk_buff *skb, struct net_device *bond_dev) >>>> { >>>> struct bonding *bond = netdev_priv(bond_dev); >>>> - struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond)); >>>> struct ethhdr *eth_data; >>>> struct slave *tx_slave = NULL; >>>> u32 hash_index; >>>> @@ -1462,12 +1427,14 @@ int bond_tlb_xmit(struct sk_buff *skb, struct >>>> net_device *bond_dev) >>>> hash_index & >>>> 0xFF, >>>> skb->len); >>>> } else { >>>> - struct tlb_up_slave *slaves; >>>> + struct bond_up_slave *slaves; >>>> + unsigned int count; >>>> >>>> - slaves = >>>> rcu_dereference(bond_info->slave_arr); >>>> - if (slaves && slaves->count) >>>> + slaves = rcu_dereference(bond->slave_arr); >>>> + count = slaves ? slaves->count : 0; >>>> + if (count) >>> >>> ^^^^^^^^^^^^^^^^^^^^^^^^^^ >>> In both of these cases (slaves & slaves->count) you could use likely() as in >>> the case we have slaves to transmit, it'd be advantageous and will be hit >>> every time. The cases that we get here and don't have a slave_arr/count are >>> mostly 2: >>> 1. The slaves got released between the start of xmit and this part >>> 2. They were not eligible so the array got emptied >>> In both of these cases we don't care about the fallout path since >>> transmission isn't possible anyway. >>> Also another minor thing is that usually in the cases where we want to fetch >>> a variable only once ACCESS_ONCE() is used as a weak compiler barrier to >>> make sure the compiler doesn't optimize out something. The others can >>> correct me if I'm wrong, but I think in this case it's a good precaution for >>> slaves->count. >>> These are merely suggestions, I might be wrong. >>> > Will do. >>> >>>> tx_slave = slaves->arr[hash_index >>>> % >>>> - >>>> slaves->count]; >>>> + count]; >>>> } >>>> break; >>>> } >>>> @@ -1733,10 +1700,6 @@ void bond_alb_deinit_slave(struct bonding *bond, >>>> struct slave *slave) >>>> rlb_clear_slave(bond, slave); >>>> } >>>> >>>> - if (bond_is_nondyn_tlb(bond)) >>>> - if (bond_tlb_update_slave_arr(bond, slave)) >>>> - pr_err("Failed to build slave-array for TLB >>>> mode.\n"); >>>> - >>>> } >>>> >>>> /* Caller must hold bond lock for read */ >>>> @@ -1762,7 +1725,7 @@ void bond_alb_handle_link_change(struct bonding >>>> *bond, struct slave *slave, char >>>> } >>>> >>>> if (bond_is_nondyn_tlb(bond)) { >>>> - if (bond_tlb_update_slave_arr(bond, NULL)) >>>> + if (bond_update_slave_arr(bond, NULL)) >>>> pr_err("Failed to build slave-array for TLB >>>> mode.\n"); >>>> } >>>> } >>>> diff --git a/drivers/net/bonding/bond_alb.h >>>> b/drivers/net/bonding/bond_alb.h >>>> index aaeac61d03cf..5fc76c01636c 100644 >>>> --- a/drivers/net/bonding/bond_alb.h >>>> +++ b/drivers/net/bonding/bond_alb.h >>>> @@ -139,20 +139,12 @@ struct tlb_slave_info { >>>> */ >>>> }; >>>> >>>> -struct tlb_up_slave { >>>> - unsigned int count; >>>> - struct rcu_head rcu; >>>> - struct slave *arr[0]; >>>> -}; >>>> - >>>> struct alb_bond_info { >>>> struct tlb_client_info *tx_hashtbl; /* Dynamically allocated */ >>>> spinlock_t tx_hashtbl_lock; >>>> u32 unbalanced_load; >>>> int tx_rebalance_counter; >>>> int lp_counter; >>>> - /* -------- non-dynamic tlb mode only ---------*/ >>>> - struct tlb_up_slave __rcu *slave_arr; /* Up slaves */ >>>> /* -------- rlb parameters -------- */ >>>> int rlb_enabled; >>>> struct rlb_client_info *rx_hashtbl; /* Receive hash table */ >>>> diff --git a/drivers/net/bonding/bond_main.c >>>> b/drivers/net/bonding/bond_main.c >>>> index b43b2df9e5d1..4412c458939d 100644 >>>> --- a/drivers/net/bonding/bond_main.c >>>> +++ b/drivers/net/bonding/bond_main.c >>>> @@ -1700,6 +1700,10 @@ static int __bond_release_one(struct net_device >>>> *bond_dev, >>>> write_unlock_bh(&bond->curr_slave_lock); >>>> } >>>> >>>> + if (bond_mode_uses_xmit_hash(bond) && >>>> + bond_update_slave_arr(bond, slave)) >>>> + pr_err("Failed to build slave-array.\n"); >>>> + >>>> netdev_info(bond_dev, "Releasing %s interface %s\n", >>>> bond_is_active_slave(slave) ? "active" : "backup", >>>> slave_dev->name); >>>> @@ -2015,6 +2019,10 @@ static void bond_miimon_commit(struct bonding >>>> *bond) >>>> bond_alb_handle_link_change(bond, slave, >>>> BOND_LINK_UP); >>>> >>>> + if (BOND_MODE(bond) == BOND_MODE_XOR && >>>> + bond_update_slave_arr(bond, NULL)) >>>> + pr_err("Failed to build slave-array for >>>> XOR mode.\n"); >>>> + >>>> if (!bond->curr_active_slave || slave == primary) >>>> goto do_failover; >>>> >>>> @@ -2042,6 +2050,10 @@ static void bond_miimon_commit(struct bonding >>>> *bond) >>>> bond_alb_handle_link_change(bond, slave, >>>> >>>> BOND_LINK_DOWN); >>>> >>>> + if (BOND_MODE(bond) == BOND_MODE_XOR && >>>> + bond_update_slave_arr(bond, NULL)) >>>> + pr_err("Failed to build slave-array for >>>> XOR mode.\n"); >>>> + >>>> if (slave == >>>> rcu_access_pointer(bond->curr_active_slave)) >>>> goto do_failover; >>>> >>>> @@ -2505,6 +2517,9 @@ static void bond_loadbalance_arp_mon(struct >>>> work_struct *work) >>>> >>>> if (slave_state_changed) { >>>> bond_slave_state_change(bond); >>>> + if (BOND_MODE(bond) == BOND_MODE_XOR && >>>> + bond_update_slave_arr(bond, NULL)) >>>> + pr_err("Failed to build slave-array for >>>> XOR mode.\n"); >>>> } else if (do_failover) { >>>> /* the bond_select_active_slave must hold RTNL >>>> * and curr_slave_lock for write. >>>> @@ -2899,11 +2914,23 @@ static int bond_slave_netdev_event(unsigned long >>>> event, >>>> if (old_duplex != slave->duplex) >>>> bond_3ad_adapter_duplex_changed(slave); >>>> } >>>> + /* Refresh slave-array if applicable! >>>> + * If the setuo does not use miimon or arpmon >>>> (mode-specific!), >>>> + * then these event will not cause the slave-array to be >>>> + * refreshed. This will cause xmit to use a slave that is >>>> not >>>> + * usable. Avoid such situation by refeshing the array at >>>> these >>>> + * events. If these (miimon/arpmon) parameters are >>>> configured >>>> + * then array gets refreshed twice and that should be >>>> fine! >>>> + */ >>>> + if (bond_mode_uses_xmit_hash(bond) && >>>> + bond_update_slave_arr(bond, NULL)) >>>> + pr_err("Failed to build slave-array for XOR >>>> mode.\n"); >>>> break; >>>> case NETDEV_DOWN: >>>> - /* >>>> - * ... Or is it this? >>>> - */ >>>> + /* Refresh slave-array if applicable! */ >>>> + if (bond_mode_uses_xmit_hash(bond) && >>>> + bond_update_slave_arr(bond, NULL)) >>>> + pr_err("Failed to build slave-array for XOR >>>> mode.\n"); >>>> break; >>>> case NETDEV_CHANGEMTU: >>>> /* >>>> @@ -3147,6 +3174,10 @@ static int bond_open(struct net_device *bond_dev) >>>> bond_3ad_initiate_agg_selection(bond, 1); >>>> } >>>> >>>> + if (bond_mode_uses_xmit_hash(bond) && >>>> + bond_update_slave_arr(bond, NULL)) >>>> + pr_err("Failed to build slave-array for XOR mode.\n"); >>>> + >>>> return 0; >>>> } >>>> >>>> @@ -3654,15 +3685,106 @@ static int bond_xmit_activebackup(struct sk_buff >>>> *skb, struct net_device *bond_d >>>> return NETDEV_TX_OK; >>>> } >>>> >>>> -/* In bond_xmit_xor() , we determine the output device by using a pre- >>>> - * determined xmit_hash_policy(), If the selected device is not enabled, >>>> - * find the next active slave. >>>> +/* Build the usable slaves array in control path for modes that use >>>> xmit-hash >>>> + * to determine the slave interface - >>>> + * (a) BOND_MODE_8023AD >>>> + * (b) BOND_MODE_XOR >>>> + * (c) BOND_MODE_TLB && tlb_dynamic_lb == 0 >>>> */ >>>> -static int bond_xmit_xor(struct sk_buff *skb, struct net_device >>>> *bond_dev) >>>> +int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave) >>>> { >>>> - struct bonding *bond = netdev_priv(bond_dev); >>>> + struct slave *slave; >>>> + struct list_head *iter; >>>> + struct bond_up_slave *new_arr, *old_arr; >>>> + int slaves_in_agg; >>>> + int agg_id = 0; >>>> + int ret = 0; >>>> + >>>> + new_arr = kzalloc(offsetof(struct bond_up_slave, >>>> arr[bond->slave_cnt]), >>>> + GFP_ATOMIC); >>>> + if (!new_arr) { >>>> + ret = -ENOMEM; >>>> + goto out; >>>> + } >>>> + if (BOND_MODE(bond) == BOND_MODE_8023AD) { >>>> + struct ad_info ad_info; >>>> + >>>> + if (bond_3ad_get_active_agg_info(bond, &ad_info)) { >>>> + pr_debug("bond_3ad_get_active_agg_info failed\n"); >>>> + kfree_rcu(new_arr, rcu); >>>> + ret = -EINVAL; >>>> + goto out; >>>> + } >>>> + slaves_in_agg = ad_info.ports; >>>> + agg_id = ad_info.aggregator_id; >>>> + } >>>> + bond_for_each_slave(bond, slave, iter) { >>>> + if (BOND_MODE(bond) == BOND_MODE_8023AD) { >>>> + struct aggregator *agg; >>>> >>>> - bond_xmit_slave_id(bond, skb, bond_xmit_hash(bond, skb) % >>>> bond->slave_cnt); >>>> + agg = SLAVE_AD_INFO(slave)->port.aggregator; >>>> + if (!agg || agg->aggregator_identifier != agg_id) >>>> + continue; >>>> + } >>>> + if (!bond_slave_can_tx(slave)) >>>> + continue; >>>> + if (skipslave == slave) >>>> + continue; >>>> + new_arr->arr[new_arr->count++] = slave; >>>> + } >>>> + >>>> + old_arr = rcu_dereference_protected(bond->slave_arr, >>>> + lockdep_rtnl_is_held() || >>>> + >>>> lockdep_is_held(&bond->curr_slave_lock)); >>>> + rcu_assign_pointer(bond->slave_arr, new_arr); >>>> + if (old_arr) >>>> + kfree_rcu(old_arr, rcu); >>>> + >>>> +out: >>>> + if (ret != 0 && skipslave) { >>>> + int idx; >>>> + >>>> + /* Rare situation where caller has asked to skip a >>>> specific >>>> + * slave but allocation failed (most likely!). BTW this is >>>> + * only possible when the call is initiated from >>>> + * __bond_release_one(). In this sitation; overwrite the >>>> + * skipslave entry in the array with the last entry from >>>> the >>>> + * array to avoid a situation where the xmit path may >>>> choose >>>> + * this to-be-skipped slave to send a packet out. >>>> + */ >>>> + old_arr = rtnl_dereference(bond->slave_arr); >>>> + for (idx = 0; idx < old_arr->count; idx++) { >>>> + if (skipslave == old_arr->arr[idx]) { >>>> + old_arr->arr[idx] = >>>> + old_arr->arr[old_arr->count-1]; >>>> + old_arr->count--; >>>> + break; >>>> + } >>>> + } >>>> + } >>>> + return ret; >>>> +} >>>> + >>>> +/* Use this Xmit function for 3AD as well as XOR modes. The current >>>> + * usable slave array is formed in the control path. The xmit function >>>> + * just calculates hash and sends the packet out. >>>> + */ >>>> +int bond_3ad_xor_xmit(struct sk_buff *skb, struct net_device *dev) >>>> +{ >>>> + struct bonding *bond = netdev_priv(dev); >>>> + struct slave *slave; >>>> + struct bond_up_slave *slaves; >>>> + unsigned int count; >>>> + >>>> + slaves = rcu_dereference(bond->slave_arr); >>>> + count = slaves ? slaves->count : 0; >>>> + if (count) { >>> >>> ^^^^^^^^^^^^^^ >>> The same comment as above applies here, too. >>> >>> >>>> + slave = slaves->arr[bond_xmit_hash(bond, skb) % count]; >>>> + bond_dev_queue_xmit(bond, skb, slave->dev); >>>> + } else { >>>> + dev_kfree_skb_any(skb); >>>> + atomic_long_inc(&dev->tx_dropped); >>>> + } >>>> >>>> return NETDEV_TX_OK; >>>> } >>>> @@ -3764,12 +3886,11 @@ static netdev_tx_t __bond_start_xmit(struct >>>> sk_buff *skb, struct net_device *dev >>>> return bond_xmit_roundrobin(skb, dev); >>>> case BOND_MODE_ACTIVEBACKUP: >>>> return bond_xmit_activebackup(skb, dev); >>>> + case BOND_MODE_8023AD: >>>> case BOND_MODE_XOR: >>>> - return bond_xmit_xor(skb, dev); >>>> + return bond_3ad_xor_xmit(skb, dev); >>>> case BOND_MODE_BROADCAST: >>>> return bond_xmit_broadcast(skb, dev); >>>> - case BOND_MODE_8023AD: >>>> - return bond_3ad_xmit_xor(skb, dev); >>>> case BOND_MODE_ALB: >>>> return bond_alb_xmit(skb, dev); >>>> case BOND_MODE_TLB: >>>> @@ -3947,6 +4068,7 @@ static void bond_uninit(struct net_device *bond_dev) >>>> struct bonding *bond = netdev_priv(bond_dev); >>>> struct list_head *iter; >>>> struct slave *slave; >>>> + struct bond_up_slave *arr; >>>> >>>> bond_netpoll_cleanup(bond_dev); >>>> >>>> @@ -3955,6 +4077,12 @@ static void bond_uninit(struct net_device >>>> *bond_dev) >>>> __bond_release_one(bond_dev, slave->dev, true); >>>> netdev_info(bond_dev, "Released all slaves\n"); >>>> >>>> + arr = rtnl_dereference(bond->slave_arr); >>>> + if (arr) { >>>> + kfree_rcu(arr, rcu); >>>> + RCU_INIT_POINTER(bond->slave_arr, NULL); >>>> + } >>>> + >>>> list_del(&bond->bond_list); >>>> >>>> bond_debug_unregister(bond); >>>> diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h >>>> index 8375133dd347..78d6d3b7a780 100644 >>>> --- a/drivers/net/bonding/bonding.h >>>> +++ b/drivers/net/bonding/bonding.h >>>> @@ -177,6 +177,12 @@ struct slave { >>>> struct kobject kobj; >>>> }; >>>> >>>> +struct bond_up_slave { >>>> + unsigned int count; >>>> + struct rcu_head rcu; >>>> + struct slave *arr[0]; >>>> +}; >>>> + >>>> /* >>>> * Link pseudo-state only used internally by monitors >>>> */ >>>> @@ -193,6 +199,7 @@ struct bonding { >>>> struct slave __rcu *curr_active_slave; >>>> struct slave __rcu *current_arp_slave; >>>> struct slave __rcu *primary_slave; >>>> + struct bond_up_slave __rcu *slave_arr; /* Array of usable slaves >>>> */ >>>> bool force_primary; >>>> s32 slave_cnt; /* never change this value outside the >>>> attach/detach wrappers */ >>>> int (*recv_probe)(const struct sk_buff *, struct bonding *, >>>> @@ -530,6 +537,7 @@ const char *bond_slave_link_status(s8 link); >>>> struct bond_vlan_tag *bond_verify_device_path(struct net_device >>>> *start_dev, >>>> struct net_device *end_dev, >>>> int level); >>>> +int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave); >>>> >>>> #ifdef CONFIG_PROC_FS >>>> void bond_create_proc_entry(struct bonding *bond); >>>> >>> > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@...r.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@...r.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists