Message-ID: <20090501173916.GA6608@gospo.rdu.redhat.com>
Date:	Fri, 1 May 2009 13:39:16 -0400
From:	Andy Gospodarek <andy@...yhouse.net>
To:	netdev@...r.kernel.org, fubar@...ibm.com
Cc:	bonding-devel@...ts.sourceforge.net
Subject: [PATCH] bonding: add mark mode


Quite a few people are happy with the way bonding splits traffic across
the members of a bond.  Many others are disappointed that users and
administrators cannot be given more control over the traffic
distribution, and would like something they can control more easily.  I
looked at extending some of the existing modes, but the cleanest option
seemed to be an additional mode that handles this case.  I hated to
create yet another mode, but its simplicity made it a good candidate.  I
have decided to call this mode 'mark' (or mode 7).

The mark mode of bonding relies on the skb->mark field for outgoing
device selection.  Unmarked frames (those whose mark is still zero) will
be sent by the first active enslaved device.  For marked frames, the
outgoing device is chosen by taking the mark modulo the number of
enslaved devices.  If that device is inactive (link-down), the traffic
falls back to the first active enslaved device.  I debated how to use
the mark to decide the outgoing device, but the modulo of the mark and
the number of enslaved devices seemed to provide the most flexibility
for those who already mark frames for other purposes.
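
The selection rule described above can be sketched in user space as
follows.  This is only an illustration, not the kernel code from the
patch: struct sim_slave and sim_select_slave() are hypothetical names,
and the single 'usable' flag is an assumption standing in for the
IS_UP / BOND_LINK_UP / BOND_STATE_ACTIVE checks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a bonding slave; 'usable' collapses the
 * IS_UP, BOND_LINK_UP and BOND_STATE_ACTIVE checks into one flag. */
struct sim_slave {
	const char *name;
	int usable;
};

/* Return the index of the slave that would transmit a frame carrying
 * 'mark', or -1 if no slave is usable (the frame would be dropped). */
static int sim_select_slave(const struct sim_slave *slaves, size_t cnt,
			    uint32_t mark)
{
	size_t want, i;

	assert(slaves != NULL || cnt == 0);
	if (cnt == 0)
		return -1;

	/* Unmarked frames map to slot 0; marked frames use mark % cnt. */
	want = mark ? mark % cnt : 0;
	if (slaves[want].usable)
		return (int)want;

	/* Preferred slave is down: fall back to the first usable one. */
	for (i = 0; i < cnt; i++)
		if (slaves[i].usable)
			return (int)i;

	return -1;
}
```

With three slaves eth0, eth1 and eth2, where eth2 is down, a mark of 4
selects eth1 (4 % 3 = 1), while a mark of 5 would select eth2
(5 % 3 = 2) and therefore falls back to eth0.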

I considered some other options for choosing destination devices based
on marks, but the ones I came up with would require additional sysfs
configuration parameters, and I would prefer not to add any more to an
already crowded space.

I've tested this on a kernel slightly older than the net-next-2.6 tree
that this patch is against, marking frames with the mark and connmark
iptables options, and it seems to work as I expect.

Signed-off-by: Andy Gospodarek <andy@...yhouse.net>
---

 Documentation/networking/bonding.txt |   25 ++++++++++++++
 drivers/net/bonding/bond_main.c      |   59 ++++++++++++++++++++++++++++++++++-
 include/linux/if_bonding.h           |    1 
 3 files changed, 84 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
index 0876275..7a0d4c2 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -582,6 +582,31 @@ mode
 		swapped with the new curr_active_slave that was
 		chosen.
 
+	mark or 7
+
+		Mark-based policy: skbuffs that arrive to be
+		transmitted will have the mark field inspected to
+		determine the destination slave device.  When the
+		skbuff's mark is zero, the first active device in the
+		ordered list of enslaved devices will be used.  When
+		the mark is non-zero the modulo of the mark and the
+		number of enslaved devices will determine the
+		interface used for transmission.  If this device is
+		not active (link-down) then the mark will essentially
+		be ignored and the first active device in the ordered
+		list of enslaved devices will be used.
+
+		The flexibility offered with this mode allows users
+		of netfilter to move various types of traffic to
+		different slaves quite easily.  Information on this
+		can be found in the manpages for iptables/ebtables
+		as well as netfilter documentation.
+
+		Prerequisites:
+
+		1.  Without the ability to mark skbuffs this mode is
+		not useful.  Netfilter greatly aids skbuff marking.
+
 num_grat_arp
 
 	Specifies the number of gratuitous ARPs to be issued after a
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index fd73836..5e1d166 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -123,7 +123,7 @@ module_param(mode, charp, 0);
 MODULE_PARM_DESC(mode, "Mode of operation : 0 for balance-rr, "
 		       "1 for active-backup, 2 for balance-xor, "
 		       "3 for broadcast, 4 for 802.3ad, 5 for balance-tlb, "
-		       "6 for balance-alb");
+		       "6 for balance-alb, 7 for mark");
 module_param(primary, charp, 0);
 MODULE_PARM_DESC(primary, "Primary network device to use");
 module_param(lacp_rate, charp, 0);
@@ -175,6 +175,7 @@ const struct bond_parm_tbl bond_mode_tbl[] = {
 {	"802.3ad",		BOND_MODE_8023AD},
 {	"balance-tlb",		BOND_MODE_TLB},
 {	"balance-alb",		BOND_MODE_ALB},
+{	"mark",			BOND_MODE_MARK},
 {	NULL,			-1},
 };
 
@@ -224,6 +225,7 @@ static const char *bond_mode_name(int mode)
 		[BOND_MODE_8023AD]= "IEEE 802.3ad Dynamic link aggregation",
 		[BOND_MODE_TLB] = "transmit load balancing",
 		[BOND_MODE_ALB] = "adaptive load balancing",
+		[BOND_MODE_MARK] = "mark-based transmit balancing",
 	};
 
 	if (mode < 0 || mode > BOND_MODE_ALB)
@@ -4464,6 +4466,57 @@ out:
 	return 0;
 }
 
+static int bond_xmit_mark(struct sk_buff *skb, struct net_device *bond_dev)
+{
+	struct bonding *bond = netdev_priv(bond_dev);
+	struct slave *slave;
+	int i, slave_no, res = 1;
+
+	read_lock(&bond->lock);
+
+	if (!BOND_IS_OK(bond)) {
+		goto out;
+	}
+
+	/* Use the mark as the determining factor for which slave to
+	 * choose for transmission.  When behaving normally all should
+	 * work just fine.  When a slave that is destined to be the
+	 * transmitter of this frame is down, start at the front of the
+	 * list and find the first available slave. */
+
+	slave_no = skb->mark ? skb->mark % bond->slave_cnt : 0;
+
+	bond_for_each_slave(bond, slave, i) {
+		slave_no--;
+		if (slave_no < 0) {
+			break;
+		}
+	}
+
+	if (IS_UP(slave->dev) &&
+	    (slave->link == BOND_LINK_UP) &&
+	    (slave->state == BOND_STATE_ACTIVE)) {
+		res = bond_dev_queue_xmit(bond, skb, slave->dev);
+	} else {
+		/* If desired slave is down, send on the first up and
+		 * active slave in the list. */
+		bond_for_each_slave(bond, slave, i) {
+			if (IS_UP(slave->dev) &&
+			    (slave->link == BOND_LINK_UP) &&
+			    (slave->state == BOND_STATE_ACTIVE)) {
+				res = bond_dev_queue_xmit(bond, skb, slave->dev);
+				break;
+			}
+		}
+	}
+out:
+	if (res) {
+		/* no suitable interface, frame not sent */
+		dev_kfree_skb(skb);
+	}
+	read_unlock(&bond->lock);
+	return 0;
+}
 /*------------------------- Device initialization ---------------------------*/
 
 static void bond_set_xmit_hash_policy(struct bonding *bond)
@@ -4500,6 +4553,8 @@ static int bond_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	case BOND_MODE_ALB:
 	case BOND_MODE_TLB:
 		return bond_alb_xmit(skb, dev);
+	case BOND_MODE_MARK:
+		return bond_xmit_mark(skb, dev);
 	default:
 		/* Should never happen, mode already checked */
 		printk(KERN_ERR DRV_NAME ": %s: Error: Unknown bonding mode %d\n",
@@ -4537,6 +4592,8 @@ void bond_set_mode_ops(struct bonding *bond, int mode)
 		/* FALLTHRU */
 	case BOND_MODE_TLB:
 		break;
+	case BOND_MODE_MARK:
+		break;
 	default:
 		/* Should never happen, mode already checked */
 		printk(KERN_ERR DRV_NAME
diff --git a/include/linux/if_bonding.h b/include/linux/if_bonding.h
index 65c2d24..253098f 100644
--- a/include/linux/if_bonding.h
+++ b/include/linux/if_bonding.h
@@ -70,6 +70,7 @@
 #define BOND_MODE_8023AD        4
 #define BOND_MODE_TLB           5
 #define BOND_MODE_ALB		6 /* TLB + RLB (receive load balancing) */
+#define BOND_MODE_MARK		7
 
 /* each slave's link has 4 states */
 #define BOND_LINK_UP    0           /* link is up and running */