lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 21 Aug 2014 18:18:56 +0200
From:	Jiri Pirko <jiri@...nulli.us>
To:	netdev@...r.kernel.org
Cc:	davem@...emloft.net, nhorman@...driver.com, andy@...yhouse.net,
	tgraf@...g.ch, dborkman@...hat.com, ogerlitz@...lanox.com,
	jesse@...ira.com, pshelar@...ira.com, azhou@...ira.com,
	ben@...adent.org.uk, stephen@...workplumber.org,
	jeffrey.t.kirsher@...el.com, vyasevic@...hat.com,
	xiyou.wangcong@...il.com, john.r.fastabend@...el.com,
	edumazet@...gle.com, jhs@...atatu.com, sfeldma@...ulusnetworks.com,
	f.fainelli@...il.com, roopa@...ulusnetworks.com,
	linville@...driver.com, dev@...nvswitch.org, jasowang@...hat.com,
	ebiederm@...ssion.com, nicolas.dichtel@...nd.com,
	ryazanov.s.a@...il.com, buytenh@...tstofly.org,
	aviadr@...lanox.com, nbd@...nwrt.org, alexei.starovoitov@...il.com,
	Neil.Jerram@...aswitch.com, ronye@...lanox.com
Subject: [patch net-next RFC 03/12] net: introduce generic switch devices support

The goal of this is to provide a possibility to suport various switch
chips. Drivers should implement relevant ndos to do so. Now there is a
couple of ndos defines:
- for getting physical switch id is in place.
- for work with flows.

Note that user can use random port netdevice to access the switch.

Signed-off-by: Jiri Pirko <jiri@...nulli.us>
---
 Documentation/networking/switchdev.txt |  53 +++++++++++
 include/linux/netdevice.h              |  28 ++++++
 include/linux/switchdev.h              |  44 +++++++++
 net/Kconfig                            |   6 ++
 net/core/Makefile                      |   1 +
 net/core/switchdev.c                   | 163 +++++++++++++++++++++++++++++++++
 6 files changed, 295 insertions(+)
 create mode 100644 Documentation/networking/switchdev.txt
 create mode 100644 include/linux/switchdev.h
 create mode 100644 net/core/switchdev.c

diff --git a/Documentation/networking/switchdev.txt b/Documentation/networking/switchdev.txt
new file mode 100644
index 0000000..435746a
--- /dev/null
+++ b/Documentation/networking/switchdev.txt
@@ -0,0 +1,53 @@
+Switch device drivers HOWTO
+===========================
+
+First lets describe a topology a bit. Imagine the following example:
+
+       +----------------------------+    +---------------+
+       |     SOME switch chip       |    |      CPU      |
+       +----------------------------+    +---------------+
+       port1 port2 port3 port4 MNGMNT    |     PCI-E     |
+         |     |     |     |     |       +---------------+
+        PHY   PHY    |     |     |         |  NIC0 NIC1
+                     |     |     |         |   |    |
+                     |     |     +- PCI-E -+   |    |
+                     |     +------- MII -------+    |
+                     +------------- MII ------------+
+
+In this example, there are two independent lines between the switch silicon
+and CPU. NIC0 and NIC1 drivers are not aware of a switch presence. They are
+separate from the switch driver. SOME switch chip is by managed by a driver
+via PCI-E device MNGMNT. Note that MNGMNT device, NIC0 and NIC1 may be
+connected to some other type of bus.
+
+Now, for the previous example show the representation in kernel:
+
+       +----------------------------+    +---------------+
+       |     SOME switch chip       |    |      CPU      |
+       +----------------------------+    +---------------+
+       sw0p0 sw0p1 sw0p2 sw0p3 MNGMNT    |     PCI-E     |
+         |     |     |     |     |       +---------------+
+        PHY   PHY    |     |     |         |  eth0 eth1
+                     |     |     |         |   |    |
+                     |     |     +- PCI-E -+   |    |
+                     |     +------- MII -------+    |
+                     +------------- MII ------------+
+
+Lets call the example switch driver for SOME switch chip "SOMEswitch". This
+driver takes care of PCI-E device MNGMNT. There is a netdevice instance sw0pX
+created for each port of a switch. These netdevices are instances
+of "SOMEswitch" driver. sw0pX netdevices serve as a "representation"
+of the switch chip. eth0 and eth1 are instances of some other existing driver.
+
+The only difference of the switch-port netdevice from the ordinary netdevice
+is that is implements couple more NDOs:
+
+	ndo_swdev_get_id - This returns the same ID for two port netdevices of
+			   the same physical switch chip. This is mandatory to
+			   be implemented by all switch drivers and serves
+			   the caller for recognition of a port netdevice.
+	ndo_swdev_* - Functions that serve for a manipulation of the switch chip
+		      itself. They are not port-specific. Caller might use
+		      arbitrary port netdevice of the same switch and it will
+		      make no difference.
+	ndo_swportdev_* - Functions that serve for a port-specific manipulation.
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 39294b9..8b5d14c 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -49,6 +49,8 @@
 
 #include <linux/netdev_features.h>
 #include <linux/neighbour.h>
+#include <linux/sw_flow.h>
+
 #include <uapi/linux/netdevice.h>
 
 struct netpoll_info;
@@ -997,6 +999,24 @@ typedef u16 (*select_queue_fallback_t)(struct net_device *dev,
  *	Callback to use for xmit over the accelerated station. This
  *	is used in place of ndo_start_xmit on accelerated net
  *	devices.
+ *
+ * int (*ndo_swdev_get_id)(struct net_device *dev,
+ *			   struct netdev_phys_item_id *psid);
+ *	Called to get an ID of the switch chip this port is part of.
+ *	If driver implements this, it indicates that it represents a port
+ *	of a switch chip.
+ *
+ * int (*ndo_swdev_flow_insert)(struct net_device *dev,
+ *				const struct sw_flow *flow);
+ *	Called to insert a flow into switch device. If driver does
+ *	not implement this, it is assumed that the hw does not have
+ *	a capability to work with flows.
+ *
+ * int (*ndo_swdev_flow_remove)(struct net_device *dev,
+ *				const struct sw_flow *flow);
+ *	Called to remove a flow from switch device. If driver does
+ *	not implement this, it is assumed that the hw does not have
+ *	a capability to work with flows.
  */
 struct net_device_ops {
 	int			(*ndo_init)(struct net_device *dev);
@@ -1146,6 +1166,14 @@ struct net_device_ops {
 							struct net_device *dev,
 							void *priv);
 	int			(*ndo_get_lock_subclass)(struct net_device *dev);
+#ifdef CONFIG_NET_SWITCHDEV
+	int			(*ndo_swdev_get_id)(struct net_device *dev,
+						    struct netdev_phys_item_id *psid);
+	int			(*ndo_swdev_flow_insert)(struct net_device *dev,
+							 const struct sw_flow *flow);
+	int			(*ndo_swdev_flow_remove)(struct net_device *dev,
+							 const struct sw_flow *flow);
+#endif
 };
 
 /**
diff --git a/include/linux/switchdev.h b/include/linux/switchdev.h
new file mode 100644
index 0000000..ba77a68
--- /dev/null
+++ b/include/linux/switchdev.h
@@ -0,0 +1,44 @@
+/*
+ * include/linux/switchdev.h - Switch device API
+ * Copyright (c) 2014 Jiri Pirko <jiri@...nulli.us>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+#ifndef _LINUX_SWITCHDEV_H_
+#define _LINUX_SWITCHDEV_H_
+
+#include <linux/netdevice.h>
+#include <linux/sw_flow.h>
+
+#ifdef CONFIG_NET_SWITCHDEV
+
+int swdev_get_id(struct net_device *dev, struct netdev_phys_item_id *psid);
+int swdev_flow_insert(struct net_device *dev, const struct sw_flow *flow);
+int swdev_flow_remove(struct net_device *dev, const struct sw_flow *flow);
+
+#else
+
+static inline int swdev_get_id(struct net_device *dev,
+			       struct netdev_phys_item_id *psid)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline int swdev_flow_insert(struct net_device *dev,
+				    const struct sw_flow *flow)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline int swdev_flow_remove(struct net_device *dev,
+				    const struct sw_flow *flow)
+{
+	return -EOPNOTSUPP;
+}
+
+#endif
+
+#endif /* _LINUX_SWITCHDEV_H_ */
diff --git a/net/Kconfig b/net/Kconfig
index 4051fdf..40f729f 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -290,6 +290,12 @@ config NET_FLOW_LIMIT
 	  with many clients some protection against DoS by a single (spoofed)
 	  flow that greatly exceeds average workload.
 
+config NET_SWITCHDEV
+	boolean "Switch device support"
+	depends on INET
+	---help---
+	  This module provides support for hardware switch chips.
+
 menu "Network testing"
 
 config NET_PKTGEN
diff --git a/net/core/Makefile b/net/core/Makefile
index 71093d9..8583c38 100644
--- a/net/core/Makefile
+++ b/net/core/Makefile
@@ -24,3 +24,4 @@ obj-$(CONFIG_NETWORK_PHY_TIMESTAMPING) += timestamping.o
 obj-$(CONFIG_NET_PTP_CLASSIFY) += ptp_classifier.o
 obj-$(CONFIG_CGROUP_NET_PRIO) += netprio_cgroup.o
 obj-$(CONFIG_CGROUP_NET_CLASSID) += netclassid_cgroup.o
+obj-$(CONFIG_NET_SWITCHDEV) += switchdev.o
diff --git a/net/core/switchdev.c b/net/core/switchdev.c
new file mode 100644
index 0000000..4fad097
--- /dev/null
+++ b/net/core/switchdev.c
@@ -0,0 +1,163 @@
+/*
+ * net/core/switchdev.c - Switch device API
+ * Copyright (c) 2014 Jiri Pirko <jiri@...nulli.us>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/init.h>
+#include <linux/netdevice.h>
+#include <linux/switchdev.h>
+
+/**
+ *	swdev_get_id - Get ID of a switch
+ *	@dev: port device
+ *	@psid: switch ID
+ *
+ *	Get ID of a switch this port is part of.
+ */
+int swdev_get_id(struct net_device *dev, struct netdev_phys_item_id *psid)
+{
+	const struct net_device_ops *ops = dev->netdev_ops;
+
+	if (!ops->ndo_swdev_get_id)
+		return -EOPNOTSUPP;
+	return ops->ndo_swdev_get_id(dev, psid);
+}
+EXPORT_SYMBOL(swdev_get_id);
+
+static void print_flow_key_tun(const char *prefix,
+			       const struct sw_flow_key *key)
+{
+	pr_debug("%s tun  { id %08llx, s %pI4, d %pI4, f %02x, tos %x, ttl %x }\n",
+		 prefix,
+		 be64_to_cpu(key->tun_key.tun_id), &key->tun_key.ipv4_src,
+		 &key->tun_key.ipv4_dst, ntohs(key->tun_key.tun_flags),
+		 key->tun_key.ipv4_tos, key->tun_key.ipv4_ttl);
+}
+
+static void print_flow_key_phy(const char *prefix,
+			       const struct sw_flow_key *key)
+{
+	pr_debug("%s phy  { prio %04x, mark %04x, in_port %02x }\n",
+		 prefix,
+		 key->phy.priority, key->phy.skb_mark, key->phy.in_port);
+}
+
+static void print_flow_key_eth(const char *prefix,
+			       const struct sw_flow_key *key)
+{
+	pr_debug("%s eth  { sm %pM, dm %pM, tci %04x, type %04x }\n",
+		 prefix,
+		 key->eth.src, key->eth.dst, ntohs(key->eth.tci),
+		 ntohs(key->eth.type));
+}
+
+static void print_flow_key_ip(const char *prefix,
+			      const struct sw_flow_key *key)
+{
+	pr_debug("%s ip   { proto %02x, tos %02x, ttl %02x }\n",
+		 prefix,
+		 key->ip.proto, key->ip.tos, key->ip.ttl);
+}
+
+static void print_flow_key_ipv4(const char *prefix,
+				const struct sw_flow_key *key)
+{
+	pr_debug("%s ipv4 { si %pI4, di %pI4, sm %pM, dm %pM }\n",
+		 prefix,
+		 &key->ipv4.addr.src, &key->ipv4.addr.dst,
+		 key->ipv4.arp.sha, key->ipv4.arp.tha);
+}
+
+static void print_flow_actions(struct sw_flow_actions *actions)
+{
+	int i;
+
+	pr_debug("  actions:\n");
+	if (!actions)
+		return;
+	for (i = 0; i < actions->count; i++) {
+		struct sw_flow_action *action = &actions->actions[i];
+
+		switch (action->type) {
+		case SW_FLOW_ACTION_TYPE_OUTPUT:
+			pr_debug("    output    { dev %s }\n",
+				 action->output_dev->name);
+			break;
+		case SW_FLOW_ACTION_TYPE_VLAN_PUSH:
+			pr_debug("    vlan push { proto %04x, tci %04x }\n",
+				 ntohs(action->vlan.vlan_proto),
+				 ntohs(action->vlan.vlan_tci));
+			break;
+		case SW_FLOW_ACTION_TYPE_VLAN_POP:
+			pr_debug("    vlan pop\n");
+			break;
+		}
+	}
+}
+
+#define PREFIX_NONE "      "
+#define PREFIX_MASK "  mask"
+
+static void print_flow(const struct sw_flow *flow, struct net_device *dev,
+		       const char *comment)
+{
+	pr_debug("%s flow %s (%x-%x):\n", dev->name, comment,
+		 flow->mask->range.start, flow->mask->range.end);
+	print_flow_key_tun(PREFIX_NONE, &flow->key);
+	print_flow_key_tun(PREFIX_MASK, &flow->mask->key);
+	print_flow_key_phy(PREFIX_NONE, &flow->key);
+	print_flow_key_phy(PREFIX_MASK, &flow->mask->key);
+	print_flow_key_eth(PREFIX_NONE, &flow->key);
+	print_flow_key_eth(PREFIX_MASK, &flow->mask->key);
+	print_flow_key_ip(PREFIX_NONE, &flow->key);
+	print_flow_key_ip(PREFIX_MASK, &flow->mask->key);
+	print_flow_key_ipv4(PREFIX_NONE, &flow->key);
+	print_flow_key_ipv4(PREFIX_MASK, &flow->mask->key);
+	print_flow_actions(flow->actions);
+}
+
+/**
+ *	swdev_flow_insert - Insert a flow into switch
+ *	@dev: port device
+ *	@flow: flow descriptor
+ *
+ *	Insert a flow into switch this port is part of.
+ */
+int swdev_flow_insert(struct net_device *dev, const struct sw_flow *flow)
+{
+	const struct net_device_ops *ops = dev->netdev_ops;
+
+	print_flow(flow, dev, "insert");
+	if (!ops->ndo_swdev_flow_insert)
+		return -EOPNOTSUPP;
+	WARN_ON(!ops->ndo_swdev_get_id);
+	BUG_ON(!flow->actions);
+	return ops->ndo_swdev_flow_insert(dev, flow);
+}
+EXPORT_SYMBOL(swdev_flow_insert);
+
+/**
+ *	swdev_flow_remove - Remove a flow from switch
+ *	@dev: port device
+ *	@flow: flow descriptor
+ *
+ *	Remove a flow from switch this port is part of.
+ */
+int swdev_flow_remove(struct net_device *dev, const struct sw_flow *flow)
+{
+	const struct net_device_ops *ops = dev->netdev_ops;
+
+	print_flow(flow, dev, "remove");
+	if (!ops->ndo_swdev_flow_remove)
+		return -EOPNOTSUPP;
+	WARN_ON(!ops->ndo_swdev_get_id);
+	return ops->ndo_swdev_flow_remove(dev, flow);
+}
+EXPORT_SYMBOL(swdev_flow_remove);
-- 
1.9.3

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists