Message-ID: <X9kwqvgoAmrjAaXY@black-debian>
Date: Tue, 15 Dec 2020 21:54:54 +0000
From: Lars Everbrand <lars.everbrand@...tonmail.com>
To: Jay Vosburgh <jay.vosburgh@...onical.com>
Cc: Jakub Kicinski <kuba@...nel.org>, linux-kernel@...r.kernel.org,
Veaceslav Falico <vfalico@...il.com>,
Andy Gospodarek <andy@...yhouse.net>,
"David S. Miller" <davem@...emloft.net>, netdev@...r.kernel.org
Subject: Re: [PATCH net-next] bonding: correct rr balancing during link failure
On Tue, Dec 08, 2020 at 01:46:09PM -0800, Jay Vosburgh wrote:
>
> Jakub Kicinski <kuba@...nel.org> wrote:
>
> >On Wed, 02 Dec 2020 20:55:57 +0000 Lars Everbrand wrote:
> Are these bandwidth numbers from observation of the actual
> behavior? I'm not sure the real system would behave this way; my
> suspicion is that it would increase the likelihood of drops on the
> overused slave, not that the overall capacity would be limited.
I tested this with 2 VMs and 5 bridges, with bandwidth limitation via
'virsh domiftune' to bring the speed down to roughly 100 Mbit/s per
link.
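For reference, each link was capped along these lines (domain and
interface names are placeholders from my setup; virsh takes the values
in KiB/s, so 12500 is roughly 100 Mbit/s):

        virsh domiftune vm1 vnet0 --live --inbound 12500 --outbound 12500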
iperf results:

with patch:
---
working      iperf speed
interfaces   [Mbit/s]
5            442
4            363
3            278
2            199
1            107

without patch:
---
working      iperf speed
interfaces   [Mbit/s]
5            444
4            226
3            155
2            129
1            107
The aggregate speed at 5x100 Mbit/s is not as high as I expected, but
the sub-optimal balancing without the patch is still clearly visible.
Note that the degradation was tested by downing interfaces
sequentially, which is the worst case for this problem.
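For completeness, each step of the failure sequence looked roughly
like this (client address and interface names are placeholders):

        iperf -c 192.0.2.1 -t 30     # measure with all links up
        ip link set ens4 down        # fail one slave
        iperf -c 192.0.2.1 -t 30     # measure again; repeat per link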
> >Looking at the code in question it feels a little like we're breaking
> >abstractions if we bump the counter directly in get_slave_by_id.
>
> Agreed; I think a better way to fix this is to enable the slave
> array for balance-rr mode, and then use the array to find the right
> slave. This way, we then avoid the problematic "skip unable to tx"
> logic for free.
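
If I read that right, something like the sketch below would do it
(untested, modelled on the existing bond_xmit_3ad_xor_slave_get()
pattern and the usable_slaves array; not a submitted patch):

        static struct slave *bond_rr_slave_get(struct bonding *bond)
        {
                struct bond_up_slave *slaves;
                unsigned int count;

                /* usable_slaves holds only slaves that can tx, so
                 * indexing it round-robin never selects a failed link */
                slaves = rcu_dereference(bond->usable_slaves);
                count = slaves ? READ_ONCE(slaves->count) : 0;
                if (unlikely(!count))
                        return NULL;

                return slaves->arr[bond->rr_tx_counter++ % count];
        }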
>
> >For one thing when the function is called for IGMP packets the counter
> >should not be incremented at all. But also if packets_per_slave is not
> >1 we'd still be hitting the same leg multiple times (packets_per_slave
> >/ 2). So it seems like we should round the counter up somehow?
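
If that is the idea, the rounding could be as simple as this
(untested sketch, using roundup() and the existing packets_per_slave
parameter):

        u32 ppl = READ_ONCE(bond->params.packets_per_slave);

        /* skip ahead to the next packets_per_slave boundary so the
         * overused leg is not picked again for the leftover packets */
        if (ppl > 1)
                bond->rr_tx_counter = roundup(bond->rr_tx_counter, ppl);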
> >
> >For IGMP maybe we don't have to call bond_get_slave_by_id() at all,
> >IMHO, just find first leg that can TX. Then we can restructure
> >bond_get_slave_by_id() appropriately for the non-IGMP case.
>
> For IGMP, the theory is to confine that traffic to a single
> device. Normally, this will be curr_active_slave, which is updated even
> in balance-rr mode as interfaces are added to or removed from the bond.
> The call to bond_get_slave_by_id should be a fallback in case
> curr_active_slave is empty, and should be the exception, and may not be
> possible at all.
>
> But either way, the IGMP path shouldn't mess with rr_tx_counter,
> it should be out of band of the normal TX packet counting, so to speak.
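
Understood. The IGMP branch would then keep the shape of the current
bond_xmit_roundrobin_slave_get() path, only without ever touching the
counter (sketch, header pulling elided):

        if (skb->protocol == htons(ETH_P_IP) &&
            ip_hdr(skb)->protocol == IPPROTO_IGMP) {
                /* confine IGMP to one device; rr_tx_counter untouched */
                slave = rcu_dereference(bond->curr_active_slave);
                if (slave)
                        return slave;
                return bond_get_slave_by_id(bond, 0);
        }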
>
> -J
>
> >> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> >> index e0880a3840d7..e02d9c6d40ee 100644
> >> --- a/drivers/net/bonding/bond_main.c
> >> +++ b/drivers/net/bonding/bond_main.c
> >> @@ -4107,6 +4107,7 @@ static struct slave *bond_get_slave_by_id(struct bonding *bond,
> >> if (--i < 0) {
> >> if (bond_slave_can_tx(slave))
> >> return slave;
> >> + bond->rr_tx_counter++;
> >> }
> >> }
> >>
> >> @@ -4117,6 +4118,7 @@ static struct slave *bond_get_slave_by_id(struct bonding *bond,
> >> break;
> >> if (bond_slave_can_tx(slave))
> >> return slave;
> >> + bond->rr_tx_counter++;
> >> }
> >> /* no slave that can tx has been found */
> >> return NULL;
> >
>
> ---
> -Jay Vosburgh, jay.vosburgh@...onical.com