Message-Id: <1292549726-15957-1-git-send-email-fubar@us.ibm.com>
Date:	Thu, 16 Dec 2010 17:35:24 -0800
From:	Jay Vosburgh <fubar@...ibm.com>
To:	netdev@...r.kernel.org
Cc:	Andy Gospodarek <andy@...yhouse.net>
Subject: [PATCH RFC v3 0/2] bonding: generic netlink, multi-link mode

	[ v3: moved up to today's net-next-2.6, cleaned up various cruft,
	  checkpatch stuff ]

	These patches add support to bonding for generic netlink and a new
multi-link mode.  At the moment, I'm looking primarily for discussion
about the generic netlink and implementation of multi-link.

	First, in patch 1, is a generic netlink infrastructure for
bonding.  This patch provides a "get mode" command and a "slave link state
change" asychnronous notification via a netlink multicast group.  One long
term goal is to have bonding be controlled via netlink, both for
administrative purposes (add / remove slaves, etc) and policy (slave A is
better than slave B).  I'd appreciate feedback from netlink savvy folks as
to whether this is the appropriate starting point.
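
	To make the shape of this concrete, here is a rough sketch of the
registration side (the attribute, command, and group names below are
placeholders for illustration, not necessarily what the patch uses):

#include <net/genetlink.h>

enum {
	BOND_GENL_ATTR_UNSPEC,
	BOND_GENL_ATTR_MODE,		/* u32: current bonding mode */
	__BOND_GENL_ATTR_MAX,
};
#define BOND_GENL_ATTR_MAX (__BOND_GENL_ATTR_MAX - 1)

enum {
	BOND_GENL_CMD_UNSPEC,
	BOND_GENL_CMD_GET_MODE,		/* user space -> kernel query */
	BOND_GENL_CMD_LINK_STATE,	/* kernel -> multicast notification */
};

static struct genl_family bond_genl_family = {
	.id	 = GENL_ID_GENERATE,
	.name	 = "bond",
	.version = 1,
	.maxattr = BOND_GENL_ATTR_MAX,
};

static struct genl_multicast_group bond_genl_mcgrp = {
	.name = "notify",
};

static int bond_genl_get_mode(struct sk_buff *skb, struct genl_info *info)
{
	/* build a reply with genlmsg_new()/genlmsg_put(), add a
	 * BOND_GENL_ATTR_MODE u32 attribute, genlmsg_reply() it back */
	return 0;
}

static struct genl_ops bond_genl_ops[] = {
	{ .cmd = BOND_GENL_CMD_GET_MODE, .doit = bond_genl_get_mode },
};

	Registration is then genl_register_family_with_ops() plus
genl_register_mc_group(); the slave link state notification is a
genlmsg_multicast() to bond_genl_mcgrp.id.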

	Second, in patch 2, is the multi-link kernel code itself, which is
at present a work in progress.  Here, I'm primarily looking for comments
regarding the control interface for this mode.

	As implemented, this is a new mode to bonding, controlled via
generic netlink commands from a user space daemon.  Slave assignment for
outgoing traffic is handled directly by bonding (the mapping table used by
multi-link is within bonding itself, and the usual transmit hash policy is
applied to the set of slaves allowable for a given destination).
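
	In rough pseudocode (the structure and helper names here are mine,
purely for illustration), the transmit path selection looks something
like:

/* Per-destination entry in the multi-link mapping table: the set of
 * slaves allowed to carry traffic toward that destination. */
struct ml_dest_entry {
	int		nr_slaves;
	struct slave	*slaves[ML_MAX_SLAVES];
};

static struct slave *ml_select_slave(struct bonding *bond,
				     struct sk_buff *skb)
{
	struct ml_dest_entry *ent;
	int idx;

	/* mapping table lookup, keyed on the destination address
	 * (ml_lookup_dest is a hypothetical helper) */
	ent = ml_lookup_dest(bond, ip_hdr(skb)->daddr);
	if (!ent || ent->nr_slaves == 0)
		return NULL;

	/* usual transmit hash policy, applied over the allowed set */
	idx = bond->xmit_hash_policy(skb, ent->nr_slaves);
	return ent->slaves[idx];
}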

	In some private discussion with Andy, he suggested that this would
be better if it utilized the recently added queue mapping facility within
bonding, and then had the queue (and thus slave) assignments performed
at the qdisc level (via a tc filter) instead of within bonding itself.
This, I believe, would require a new tc filter that implements the ability
to set a skb queue_mapping in a hash (of protocol data in the packet) or
round robin fashion.  In this case, the tc filter would also incorporate
all of the netlink functionality for communicating with the user space
daemon (to permit the mappings to be updated).
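
	Roughly, the classify hook of such a filter might look like the
below (entirely hypothetical; no such filter exists today).  Note that
the existing skbedit action can already set a fixed queue_mapping; the
missing piece is computing it by hash or round robin:

static int ml_classify(struct sk_buff *skb, struct tcf_proto *tp,
		       struct tcf_result *res)
{
	struct ml_filter *f = tp->root;	/* hypothetical filter state */
	u16 q;

	if (f->mode == ML_MODE_HASH)
		q = skb_get_rxhash(skb) % f->nqueues;	/* flow hash */
	else
		q = f->rr_next++ % f->nqueues;		/* round robin */

	/* bonding's queue mapping facility then steers queue q to the
	 * corresponding slave */
	skb_set_queue_mapping(skb, q);

	res->class = 0;
	res->classid = TC_H_MAKE(0, q + 1);	/* placeholder classid */
	return 0;
}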

	Thoughts?

	Lastly, a description of the multi-link system itself.  This is a
reimplementation of a load balancing scheme that has been available on AIX
for some time.  It operates essentially as a load balancer by subnet, with
a UDP-based protocol to exchange multi-link topology information between
participating systems.  Hosts participating in multi-link have IP
addresses in a separate subnet.  Interfaces enslaved to multi-link do not
lose their assigned IP address information, and may also operate
separately from multi-link.

	One notable feature is that multi-link provides load balancing
facilities for network devices that cannot change their MAC address, such
as Infiniband.

	For example, given two systems as follows:

host A:
bond0		10.88.0.1/16
slave eth0	10.0.0.1/16
slave eth1	10.1.0.1/16
slave eth2	10.2.0.1/16

host B:
bond0		10.88.0.2/16
slave eth0	10.0.0.2/16
slave eth1	10.1.0.2/16
slave eth2	10.2.0.2/16

	in this case, host A's bond0 running multi-link would load balance
traffic from 10.88.0.1 to 10.88.0.2 across eth0, eth1 and eth2.  The user
space daemon negotiates the link set to use with other participating
hosts, and communicates that to the multi-link implementation.
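
	For completeness, the daemon's side of that conversation, sketched
with libnl-3 style calls (the "bond" family name and GET_MODE command
match the hypothetical sketch earlier; a command carrying the negotiated
link set down to the kernel would follow the same pattern):

#include <netlink/netlink.h>
#include <netlink/genl/genl.h>
#include <netlink/genl/ctrl.h>

#define BOND_GENL_CMD_GET_MODE 1	/* from the (hypothetical) header */

int main(void)
{
	struct nl_sock *sock = nl_socket_alloc();
	struct nl_msg *msg;
	int fam;

	genl_connect(sock);
	fam = genl_ctrl_resolve(sock, "bond");	/* resolve family id */

	msg = nlmsg_alloc();
	genlmsg_put(msg, NL_AUTO_PORT, NL_AUTO_SEQ, fam, 0, 0,
		    BOND_GENL_CMD_GET_MODE, 1);
	nl_send_auto(sock, msg);		/* query current mode */
	nlmsg_free(msg);

	/* a hypothetical "set link set" command would add attributes
	 * describing the negotiated slave set before sending */

	nl_socket_free(sock);
	return 0;
}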

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@...ibm.com

