Date:	Mon, 08 Aug 2011 09:44:59 -0700
From:	Jay Vosburgh <fubar@...ibm.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
cc:	David Lamparter <equinox@...c24.net>,
	Phillip Susi <psusi@....rr.com>, netdev@...r.kernel.org
Subject: Re: 802.3ad bonding brain damaged?

Eric Dumazet <eric.dumazet@...il.com> wrote:

>On Monday, 08 August 2011 at 09:57 +0200, David Lamparter wrote:
>> On Sunday, 07 August 2011 at 15:52 -0400, Phillip Susi wrote:
>> > From Documentation/networking/bonding.txt:
>> > 
>> > 	Additionally, the linux bonding 802.3ad implementation
>> > 	distributes traffic by peer (using an XOR of MAC addresses),
>> > 
>> > This is counter to the entire point of 802.3ad. Distributing traffic by
>> > hash of the destination address is poor man's load balancing for
>> > systems not supporting 802.3ad. 
>> 
>> No, it isn't. 802.3ad/.1AX explicitly requires that no packet
>> re-ordering may ever occur, which can only be guaranteed by enqueueing
>> packets for one host on one TX interface. This behaviour is mandated by
>> 802.1AX-2008 page 15 which reads:
>> 
>>   This standard does not mandate any particular distribution
>>   algorithm(s); however, any distribution algorithm shall ensure that,
>>   when frames are received by a Frame Collector as specified in 5.2.3,
>>   the algorithm shall not cause
>>   a) Misordering of frames that are part of any given conversation, or
>>   b) Duplication of frames.
>> | The above requirement to maintain frame ordering is met by ensuring
>> | that all frames that compose a given conversation are transmitted on a
>> | single link in the order that they are generated by the MAC Client;
>>   hence, this requirement does not involve the addition (or
>>   modification) of any information to the MAC frame, nor any buffering
>>   or processing on the part of the corresponding Frame Collector in
>>   order to reorder frames. This approach to the operation of the
>>   distribution function permits a wide variety of distribution and load
>>   balancing algorithms to be used, while also ensuring interoperability
>>   between devices that adopt differing algorithms.
>> 
>
>It all depends on the definition of 'conversation'

	The definition from 802.1AX is:

3.8 conversation: A set of frames transmitted from one end station to
another, where all of the frames form an ordered sequence, and where the
communicating end stations require the ordering to be maintained among
the set of frames exchanged. (See IEEE Std 802.1AX, Clause 5.)

	So, basically, a conversation is a TCP connection, or a sequence
of UDP datagrams from one IP:port to another, and optionally the reverse
direction.

>Phillip assumed two (or more) TCP flows from machine A to machine B
>could use two different links, while you assert they MUST use a single
>link.

	The standard permits us to place separate conversations on
different ports, even if they are going to the same MAC destination.  

	802.1AX 5.2.1:

f) Frame ordering must be maintained for certain sequences of frame
exchanges between MAC Clients (known as conversations, see Clause
3). The Distributor ensures that all frames of a given conversation are
passed to a single port. For any given port, the Collector is required
to pass frames to the MAC Client in the order that they are received
from that port. The Collector is otherwise free to select frames
received from the aggregated ports in any order. Since there are no
means for frames to be misordered on a single link, this guarantees that
frame ordering is maintained for any conversation.

g) Conversations may be moved among ports within an aggregation, both
for load balancing and to maintain availability in the event of link
failures.

	The standard requires ordering for frames within any one
conversation, but does not require ordering of frames between
conversations.
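
	As a rough illustration of 5.2.1 f), here is a minimal Python
sketch of a Distributor and Collector.  It is a toy model, not the
bonding driver: the names distribute(), collect() and
per_conversation_order() are hypothetical, conversations are plain
integer IDs, and the port is chosen by a simple modulo.  The Collector
picks among ports at random but drains each port in FIFO order, and the
per-conversation sequence still comes out intact.

```python
import random
from collections import defaultdict

def distribute(frames, num_ports):
    """Distributor: every frame of a conversation goes to the same port."""
    ports = [[] for _ in range(num_ports)]
    for conv_id, seq in frames:
        # Toy mapping (assumption): conversation ID modulo port count.
        ports[conv_id % num_ports].append((conv_id, seq))
    return ports

def collect(ports, rng):
    """Collector: free to pick any port next, but FIFO within each port."""
    queues = [list(p) for p in ports]
    out = []
    while any(queues):
        q = rng.choice([q for q in queues if q])
        out.append(q.pop(0))
    return out

def per_conversation_order(frames):
    """Return the sequence numbers seen for each conversation, in order."""
    order = defaultdict(list)
    for conv_id, seq in frames:
        order[conv_id].append(seq)
    return dict(order)

# Three interleaved conversations, five frames each, over two ports.
sent = [(c, s) for s in range(5) for c in (1, 2, 3)]
received = collect(distribute(sent, 2), random.Random(0))

# Ordering between conversations may change; ordering within each does not.
assert per_conversation_order(received) == per_conversation_order(sent)
```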

	The layer2 (MAC) and layer2+3 (MAC + IP) hashes in bonding
comply with this.  The layer3+4 (IP + TCP/UDP port) hash does not,
because fragmented datagrams hash differently from unfragmented ones:
the port numbers are left out of the hash for fragments, so a single
conversation that mixes the two can be spread across links.  I've not
heard that this noncompliance has been a problem in actual practice.
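
	To make the hash behaviour concrete, here is a simplified
Python sketch of the layer2 and layer3+4 policies.  It follows the
formulas as I understand them from bonding.txt, not the driver source.
The function names, the example addresses and ports, and the use of
whole MAC addresses as integers are all assumptions made for
illustration.

```python
import ipaddress

def layer2_hash(src_mac, dst_mac, n_slaves):
    """(source MAC XOR destination MAC) modulo slave count."""
    return (src_mac ^ dst_mac) % n_slaves

def layer34_hash(src_ip, dst_ip, src_port, dst_port, n_slaves,
                 fragmented=False):
    """Ports XOR low 16 bits of the IP XOR; ports omitted for fragments."""
    ip_part = (src_ip ^ dst_ip) & 0xFFFF
    l4_part = 0 if fragmented else (src_port ^ dst_port)
    return (l4_part ^ ip_part) % n_slaves

# Hypothetical hosts A and B, and a two-link aggregate.
mac_a, mac_b = 0x001122334455, 0x0011223344AA
ip_a = int(ipaddress.IPv4Address("10.0.0.1"))
ip_b = int(ipaddress.IPv4Address("10.0.0.2"))
links = 2

# layer2: every flow between A and B lands on the same link.
l2_link = layer2_hash(mac_a, mac_b, links)

# layer3+4: two flows between the same hosts can use different links.
flow1 = layer34_hash(ip_a, ip_b, 40000, 2049, links)
flow2 = layer34_hash(ip_a, ip_b, 40001, 2049, links)

# The same flow, once fragmented, hashes on IP alone and may move.
flow1_frag = layer34_hash(ip_a, ip_b, 40000, 2049, links, fragmented=True)

print(l2_link, flow1, flow2, flow1_frag)
```

	With these particular values the two flows land on different
links, and the fragmented datagram of the first flow hashes to a
different link than its unfragmented datagrams, which is the case the
standard's ordering requirement does not allow.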

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@...ibm.com