netdev - Re: bonding and SR-IOV -- do we need arp_validation for loadbalancing too?


Message-ID: <500F108F.6020706@gmail.com>
Date:	Tue, 24 Jul 2012 23:15:59 +0200
From:	Nicolas de Pesloüan 
	<nicolas.2p.debian@...il.com>
To:	Jay Vosburgh <fubar@...ibm.com>, Jiri Pirko <jiri@...nulli.us>
CC:	Chris Friesen <chris.friesen@...band.com>,
	netdev <netdev@...r.kernel.org>, andy@...yhouse.net
Subject: Re: bonding and SR-IOV -- do we need arp_validation for loadbalancing
 too?

On 24/07/2012 22:49, Jay Vosburgh wrote:
[...]
>> In loadbalance mode wouldn't it just work similar to active-backup?  If
>> it's a reply then verify that it came from the arp target, if it's a
>> request then check to see if it came from one of the other slaves.
>
> 	The problem isn't verifying the requests or replies, it's that
> the ARP packets are not distributed across all slaves (because the
> switch ports are in a channel group / aggregator), so some slaves do not
> receive any ARPs.
>
> 	The bond sends the ARP request as a broadcast.  For
> active-backup, this ends up at the inactive slaves because the switch
> sends the broadcast to all ports.  For a loadbalance mode, the switch
> won't send the broadcast ARP to the other slaves, because all the slaves
> are in a channel group or lacp aggregator, which is treated by the
> switch as effectively a single switch port for this case.
>
> 	Similarly, the ARP replies are unicast, and the switch will send
> those unicast replies to only one member of the channel group or
> aggregator.  The choice there is usually a hash of some kind, so
> generally only one slave will receive the replies.

I assume team suffers from exactly the same problem, because most of this happens on the switch
side, outside the host's control. Jiri, can you confirm?
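
To make the switch-side behaviour concrete, here is a minimal userspace sketch of it. It is not
switch firmware or kernel code. The XOR-of-last-MAC-byte hash and the names agg_select_port() and
agg_deliver() are assumptions made up for illustration; real switches use various hash policies.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical switch policy: pick one member port of the channel group
 * from the low bytes of the source and destination MAC addresses. */
size_t agg_select_port(const uint8_t src[6], const uint8_t dst[6],
                       size_t n_ports)
{
    return (size_t)(src[5] ^ dst[5]) % n_ports;
}

/* Deliver n_frames identical unicast frames (e.g. ARP replies from one
 * target) to the group and count receptions per member port. */
void agg_deliver(const uint8_t src[6], const uint8_t dst[6],
                 size_t n_ports, size_t n_frames, unsigned int rx_count[])
{
    size_t i;

    for (i = 0; i < n_ports; i++)
        rx_count[i] = 0;
    for (i = 0; i < n_frames; i++)
        rx_count[agg_select_port(src, dst, n_ports)]++;
}
```

Because the (source, destination) pair of the ARP replies never changes, every reply hashes to the
same member port, so the counters of all other ports stay at zero. Nothing in host-side bonding or
team code can change that choice.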

[...]

> 	I believe bonding is the main user of last_rx (a search shows a
> couple of drivers using it internally).  For bonding use, in current
> mainline last_rx is set by bonding itself, not in the network device
> driver.

If last_rx is set and used internally by bonding and mostly unused elsewhere, can't we remove it
from net_device and move it into bonding's per-slave private data?

A comment in netdevice.h even recommends against setting it in drivers:

         unsigned long           last_rx;        /* Time of last Rx
                                                  * This should not be set in
                                                  * drivers, unless really needed,
                                                  * because network stack (bonding)
                                                  * use it if/when necessary, to
                                                  * avoid dirtying this cache line.
                                                  */
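
As a rough illustration of what I have in mind, here is a userspace sketch, not a kernel patch.
struct slave_model, slave_note_rx() and slave_rx_is_recent() are hypothetical names that exist
only for this example; the real bonding slave structure and its helpers are different.

```c
#include <stdbool.h>

/* Hypothetical stand-in for bonding's per-slave private data. */
struct slave_model {
    unsigned long last_rx;   /* time of last Rx, in jiffies-like ticks */
};

/* Called on the bonding receive path instead of touching net_device. */
void slave_note_rx(struct slave_model *s, unsigned long now)
{
    s->last_rx = now;
}

/* True if the slave received traffic within the last `interval` ticks. */
bool slave_rx_is_recent(const struct slave_model *s, unsigned long now,
                        unsigned long interval)
{
    /* Unsigned subtraction stays correct across counter wraparound. */
    return now - s->last_rx <= interval;
}
```

With the timestamp kept per slave inside bonding, drivers have no field to set by mistake, and
the net_device cache line is no longer dirtied on receive.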

	Nicolas.