Date:	Sun, 5 Oct 2014 05:52:10 +0530
From:	Mahesh Bandewar <maheshb@...gle.com>
To:	Nikolay Aleksandrov <nikolay@...hat.com>
Cc:	Jay Vosburgh <j.vosburgh@...il.com>,
	Veaceslav Falico <vfalico@...il.com>,
	Andy Gospodarek <andy@...yhouse.net>,
	David Miller <davem@...emloft.net>,
	netdev <netdev@...r.kernel.org>,
	Eric Dumazet <edumazet@...gle.com>,
	Maciej Zenczykowski <maze@...gle.com>,
	Cong Wang <cwang@...pensource.com>
Subject: Re: [PATCH v7 net-next 2/2] bonding: Simplify the xmit function for
 modes that use xmit_hash

On Sat, Oct 4, 2014 at 1:07 PM, Nikolay Aleksandrov <nikolay@...hat.com> wrote:
> On 10/04/2014 02:48 AM, Mahesh Bandewar wrote:
>> Earlier change to use usable slave array for TLB mode had an additional
>> performance advantage. So extending the same logic to all other modes
>> that use xmit-hash for slave selection (viz 802.3AD, and XOR modes).
>> Also consolidating this with the earlier TLB change.
>>
>> The main idea is to build the usable slaves array in the control path
>> and use that array for slave selection during xmit operation.
>>
>> Measured performance in a setup with a bond of 4x1G NICs with 200
>> instances of netperf for the modes involved (3ad, xor, tlb)
>> cmd: netperf -t TCP_RR -H <TargetHost> -l 60 -s 5
>>
>> Mode        TPS-Before   TPS-After
>>
>> 802.3ad   : 468,694      493,101
>> TLB (lb=0): 392,583      392,965
>> XOR       : 475,696      484,517
>>
>> Signed-off-by: Mahesh Bandewar <maheshb@...gle.com>
>> Signed-off-by: Nikolay Aleksandrov <nikolay@...hat.com>
>> ---
>> v1:
>>   (a) If bond_update_slave_arr() fails to allocate memory, it will overwrite
>>       the slave that needs to be removed.
>>   (b) Freeing of array will assign NULL (to handle bond->down to bond->up
>>       transition gracefully).
>>   (c) Change from pr_debug() to pr_err() if bond_update_slave_arr() returns
>>       failure.
>>   (d) XOR: bond_update_slave_arr() will consider mii-mon, arp-mon cases and
>>       will populate the array even if these parameters are not used.
>>   (e) 3AD: Should handle the ad_agg_selection_logic correctly.
>> v2:
>>   (a) Removed rcu_read_{un}lock() calls from array manipulation code.
>>   (b) Slave link-events now refresh array for all these modes.
>>   (c) Moved free-array call from bond_close() to bond_uninit().
>> v3:
>>   (a) Fixed null pointer dereference.
>>   (b) Removed bond->lock lockdep dependency.
>> v4:
>>   (a) Made changes to comply with Nikolay's locking changes
>>   (b) Added a work-queue to refresh slave-array when RTNL is not held
>>   (c) Array refresh happens ONLY with RTNL now.
>>   (d) alloc changed from GFP_ATOMIC to GFP_KERNEL
>> v5:
>>   (a) Consolidated all delayed slave-array updates at one place in
>>       3ad_state_machine_handler()
>> v6:
>>   (a) Free slave array when there is no active aggregator
>> v7:
>>   (a) Couple of trivial changes.
>>
>>  drivers/net/bonding/bond_3ad.c  | 140 +++++++++++------------------
>>  drivers/net/bonding/bond_alb.c  |  51 ++---------
>>  drivers/net/bonding/bond_alb.h  |   8 --
>>  drivers/net/bonding/bond_main.c | 192 +++++++++++++++++++++++++++++++++++++---
>>  drivers/net/bonding/bonding.h   |  10 +++
>>  5 files changed, 249 insertions(+), 152 deletions(-)
>>
> <<<snip>>>
>> +/* Build the usable slaves array in control path for modes that use xmit-hash
>> + * to determine the slave interface -
>> + * (a) BOND_MODE_8023AD
>> + * (b) BOND_MODE_XOR
>> + * (c) BOND_MODE_TLB && tlb_dynamic_lb == 0
>> + *
>> + * The caller is expected to hold RTNL only and NO other lock!
>> + */
>> +int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
>> +{
>> +     struct slave *slave;
>> +     struct list_head *iter;
>> +     struct bond_up_slave *new_arr, *old_arr;
>> +     int slaves_in_agg;
>> +     int agg_id = 0;
>> +     int ret = 0;
>> +
>> +#ifdef CONFIG_LOCKDEP
>> +     lockdep_assert_held(&bond->mode_lock);
>> +#endif
> ^^^^^^^^^
> This is wrong now, the logic is inverted.
> It will WARN every time mode_lock is _not_ held:
>
> #define lockdep_assert_held(l)  do {                            \
>                 WARN_ON(debug_locks && !lockdep_is_held(l));    \
>         } while (0)
>
> The previous version was correct: it did a WARN when mode_lock was
> actually held, because holding it is the wrong condition here, not when
> it's not held. I missed that comment earlier.
>
Thanks Nik, I missed that. I'll revert it!
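
To make the inversion concrete, here is a minimal userspace sketch (not
kernel code). mock_lock, warns_assert_held() and warns_if_held() are
made-up names for illustration only. The first helper mirrors the shape of
the lockdep_assert_held() macro quoted above; the second is the check this
function actually wants, which I assume looks roughly like
WARN_ON(lockdep_is_held(&bond->mode_lock)) in the previous version.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a lock whose state is tracked. */
struct mock_lock {
	bool held;
};

/* Same shape as lockdep_assert_held(): complains when the lock is NOT held. */
static bool warns_assert_held(const struct mock_lock *l)
{
	return !l->held;
}

/* What bond_update_slave_arr() wants: complain when the lock IS held. */
static bool warns_if_held(const struct mock_lock *l)
{
	return l->held;
}
```

With the lock not held (the expected state for an RTNL-only caller), the
first helper reports a warning and the second stays quiet, which is the
behaviour Nik describes.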

> (also switched Veaceslav's email address with the correct one in the CC list)
>
>> +
>> +     new_arr = kzalloc(offsetof(struct bond_up_slave, arr[bond->slave_cnt]),
>> +                       GFP_KERNEL);
>> +     if (!new_arr) {
>> +             ret = -ENOMEM;
>> +             pr_err("Failed to build slave-array.\n");
>> +             goto out;
>> +     }
>> +     if (BOND_MODE(bond) == BOND_MODE_8023AD) {
>> +             struct ad_info ad_info;
>> +
>> +             if (bond_3ad_get_active_agg_info(bond, &ad_info)) {
>> +                     pr_debug("bond_3ad_get_active_agg_info failed\n");
>> +                     kfree_rcu(new_arr, rcu);
>> +                     /* No active aggregator means it's not safe to use
>> +                      * the previous array.
>> +                      */
>> +                     old_arr = rtnl_dereference(bond->slave_arr);
>> +                     if (old_arr) {
>> +                             RCU_INIT_POINTER(bond->slave_arr, NULL);
>> +                             kfree_rcu(old_arr, rcu);
>> +                     }
>> +                     goto out;
>> +             }
>> +             slaves_in_agg = ad_info.ports;
>> +             agg_id = ad_info.aggregator_id;
>> +     }
>> +     bond_for_each_slave(bond, slave, iter) {
>> +             if (BOND_MODE(bond) == BOND_MODE_8023AD) {
>> +                     struct aggregator *agg;
>> +
>> +                     agg = SLAVE_AD_INFO(slave)->port.aggregator;
>> +                     if (!agg || agg->aggregator_identifier != agg_id)
>> +                             continue;
>> +             }
>> +             if (!bond_slave_can_tx(slave))
>> +                     continue;
>> +             if (skipslave == slave)
>> +                     continue;
>> +             new_arr->arr[new_arr->count++] = slave;
>> +     }
>> +
>> +     old_arr = rtnl_dereference(bond->slave_arr);
>> +     rcu_assign_pointer(bond->slave_arr, new_arr);
>> +     if (old_arr)
>> +             kfree_rcu(old_arr, rcu);
>> +out:
>> +     if (ret != 0 && skipslave) {
>> +             int idx;
>> +
>> +             /* Rare situation where caller has asked to skip a specific
>> +              * slave but allocation failed (most likely!). BTW this is
>> +              * only possible when the call is initiated from
>> +              * __bond_release_one(). In this situation, overwrite the
>> +              * skipslave entry in the array with the last entry from the
>> +              * array to avoid a situation where the xmit path may choose
>> +              * this to-be-skipped slave to send a packet out.
>> +              */
>> +             old_arr = rtnl_dereference(bond->slave_arr);
>> +             for (idx = 0; idx < old_arr->count; idx++) {
>> +                     if (skipslave == old_arr->arr[idx]) {
>> +                             old_arr->arr[idx] =
>> +                                 old_arr->arr[old_arr->count-1];
>> +                             old_arr->count--;
>> +                             break;
>> +                     }
>> +             }
>> +     }
>> +     return ret;
>> +}
>> +
> <<<snip>>>
>
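
For anyone following along, the two array operations discussed in this
patch can be sketched in plain userspace C. This is only an illustration
under assumptions: up_slave_arr, pick_slave() and skip_slave() are
hypothetical names, integer ids stand in for struct slave pointers, and
selecting by "hash modulo count" is my shorthand for how the xmit path
would index the array. The sketch also guards against an absent array.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SLAVES 8	/* arbitrary bound for this sketch */

/* Hypothetical stand-in for struct bond_up_slave: a count plus entries. */
struct up_slave_arr {
	unsigned int count;
	int arr[MAX_SLAVES];	/* slave ids instead of struct slave pointers */
};

/* Pick an entry by hash; returns -1 when there is no usable slave. */
static int pick_slave(const struct up_slave_arr *a, unsigned int hash)
{
	if (!a || a->count == 0)
		return -1;
	return a->arr[hash % a->count];
}

/* Drop 'id' by overwriting it with the last entry and shrinking the count,
 * the same idea as the allocation-failure fallback in the patch.
 */
static void skip_slave(struct up_slave_arr *a, int id)
{
	unsigned int i;

	if (!a)
		return;
	for (i = 0; i < a->count; i++) {
		if (a->arr[i] == id) {
			a->arr[i] = a->arr[a->count - 1];
			a->count--;
			break;
		}
	}
}
```

The swap-with-last removal does not preserve entry order, which is fine
here because the xmit side only needs some usable slave for a given hash.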
