lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20111021140005.GA29815@synalogic.ca>
Date:	Fri, 21 Oct 2011 10:00:05 -0400
From:	Benjamin Poirier <benjamin.poirier@...il.com>
To:	Jiri Pirko <jpirko@...hat.com>
Cc:	netdev@...r.kernel.org, davem@...emloft.net,
	eric.dumazet@...il.com, bhutchings@...arflare.com,
	shemminger@...tta.com, fubar@...ibm.com, andy@...yhouse.net,
	tgraf@...radead.org, ebiederm@...ssion.com, mirqus@...il.com,
	kaber@...sh.net, greearb@...delatech.com, jesse@...ira.com,
	fbl@...hat.com, jzupka@...hat.com
Subject: Re: [patch net-next V2] net: introduce ethernet teaming device

On 11/10/21 14:39, Jiri Pirko wrote:
> This patch introduces new network device called team. It supposes to be
> very fast, simple, userspace-driven alternative to existing bonding
> driver.
> 
> Userspace library called libteam with couple of demo apps is available
> here:
> https://github.com/jpirko/libteam
> Note it's still in its dipers atm.
> 
> team<->libteam use generic netlink for communication. That and rtnl
> suppose to be the only way to configure team device, no sysfs etc.
> 
> Python binding basis for libteam was recently introduced (some need
> still need to be done on it though). Daemon providing arpmon/miimon
> active-backup functionality will be introduced shortly.
> All what's necessary is already implemented in kernel team driver.
> 
> Signed-off-by: Jiri Pirko <jpirko@...hat.com>
> 
> v1->v2:
> 	- modes are made as modules. Makes team more modular and
> 	  extendable.
> 	- several commenters' nitpicks found on v1 were fixed
> 	- several other bugs were fixed.
> 	- note I ignored Eric's comment about roundrobin port selector
> 	  as Eric's way may be easily implemented as another mode (mode
> 	  "random") in future.
> ---
>  Documentation/networking/team.txt         |    2 +
>  MAINTAINERS                               |    7 +
>  drivers/net/Kconfig                       |    2 +
>  drivers/net/Makefile                      |    1 +
>  drivers/net/team/Kconfig                  |   38 +
>  drivers/net/team/Makefile                 |    7 +
>  drivers/net/team/team.c                   | 1593 +++++++++++++++++++++++++++++
>  drivers/net/team/team_mode_activebackup.c |  152 +++
>  drivers/net/team/team_mode_roundrobin.c   |  107 ++
>  include/linux/Kbuild                      |    1 +
>  include/linux/if.h                        |    1 +
>  include/linux/if_team.h                   |  233 +++++
>  12 files changed, 2144 insertions(+), 0 deletions(-)
>  create mode 100644 Documentation/networking/team.txt
>  create mode 100644 drivers/net/team/Kconfig
>  create mode 100644 drivers/net/team/Makefile
>  create mode 100644 drivers/net/team/team.c
>  create mode 100644 drivers/net/team/team_mode_activebackup.c
>  create mode 100644 drivers/net/team/team_mode_roundrobin.c
>  create mode 100644 include/linux/if_team.h
> 
> diff --git a/Documentation/networking/team.txt b/Documentation/networking/team.txt
> new file mode 100644
> index 0000000..5a01368
> --- /dev/null
> +++ b/Documentation/networking/team.txt
> @@ -0,0 +1,2 @@
> +Team devices are driven from userspace via libteam library which is here:
> +	https://github.com/jpirko/libteam
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 5008b08..c33400d 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -6372,6 +6372,13 @@ W:	http://tcp-lp-mod.sourceforge.net/
>  S:	Maintained
>  F:	net/ipv4/tcp_lp.c
>  
> +TEAM DRIVER
> +M:	Jiri Pirko <jpirko@...hat.com>
> +L:	netdev@...r.kernel.org
> +S:	Supported
> +F:	drivers/net/team/
> +F:	include/linux/if_team.h
> +
>  TEGRA SUPPORT
>  M:	Colin Cross <ccross@...roid.com>
>  M:	Erik Gilling <konkers@...roid.com>
> diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
> index 583f66c..b3020be 100644
> --- a/drivers/net/Kconfig
> +++ b/drivers/net/Kconfig
> @@ -125,6 +125,8 @@ config IFB
>  	  'ifb1' etc.
>  	  Look at the iproute2 documentation directory for usage etc
>  
> +source "drivers/net/team/Kconfig"
> +
>  config MACVLAN
>  	tristate "MAC-VLAN support (EXPERIMENTAL)"
>  	depends on EXPERIMENTAL
> diff --git a/drivers/net/Makefile b/drivers/net/Makefile
> index fa877cd..4e4ebfe 100644
> --- a/drivers/net/Makefile
> +++ b/drivers/net/Makefile
> @@ -17,6 +17,7 @@ obj-$(CONFIG_NET) += Space.o loopback.o
>  obj-$(CONFIG_NETCONSOLE) += netconsole.o
>  obj-$(CONFIG_PHYLIB) += phy/
>  obj-$(CONFIG_RIONET) += rionet.o
> +obj-$(CONFIG_NET_TEAM) += team/
>  obj-$(CONFIG_TUN) += tun.o
>  obj-$(CONFIG_VETH) += veth.o
>  obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
> diff --git a/drivers/net/team/Kconfig b/drivers/net/team/Kconfig
> new file mode 100644
> index 0000000..70a43a6
> --- /dev/null
> +++ b/drivers/net/team/Kconfig
> @@ -0,0 +1,38 @@
> +menuconfig NET_TEAM
> +	tristate "Ethernet team driver support (EXPERIMENTAL)"
> +	depends on EXPERIMENTAL
> +	---help---
> +	  This allows one to create virtual interfaces that teams together
> +	  multiple ethernet devices.
> +
> +	  Team devices can be added using the "ip" command from the
> +	  iproute2 package:
> +
> +	  "ip link add link [ address MAC ] [ NAME ] type team"
> +
> +	  To compile this driver as a module, choose M here: the module
> +	  will be called team.
> +
> +if NET_TEAM
> +
> +config NET_TEAM_MODE_ROUNDROBIN
> +	tristate "Round-robin mode support"
> +	depends on NET_TEAM
> +	---help---
> +	  Basic mode where port used for transmitting packets is selected in
> +	  round-robin fashion using packet counter.
> +
> +	  To compile this team mode as a module, choose M here: the module
> +	  will be called team_mode_roundrobin.
> +
> +config NET_TEAM_MODE_ACTIVEBACKUP
> +	tristate "Active-backup mode support"
> +	depends on NET_TEAM
> +	---help---
> +	  Only one port is active at a time and the rest of ports are used
> +	  for backup.
> +
> +	  To compile this team mode as a module, choose M here: the module
> +	  will be called team_mode_activebackup.
> +
> +endif # NET_TEAM
> diff --git a/drivers/net/team/Makefile b/drivers/net/team/Makefile
> new file mode 100644
> index 0000000..85f2028
> --- /dev/null
> +++ b/drivers/net/team/Makefile
> @@ -0,0 +1,7 @@
> +#
> +# Makefile for the network team driver
> +#
> +
> +obj-$(CONFIG_NET_TEAM) += team.o
> +obj-$(CONFIG_NET_TEAM_MODE_ROUNDROBIN) += team_mode_roundrobin.o
> +obj-$(CONFIG_NET_TEAM_MODE_ACTIVEBACKUP) += team_mode_activebackup.o
> diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
> new file mode 100644
> index 0000000..398be58
> --- /dev/null
> +++ b/drivers/net/team/team.c
> @@ -0,0 +1,1593 @@
> +/*
> + * net/drivers/team/team.c - Network team device driver
> + * Copyright (c) 2011 Jiri Pirko <jpirko@...hat.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/types.h>
> +#include <linux/module.h>
> +#include <linux/init.h>
> +#include <linux/slab.h>
> +#include <linux/rcupdate.h>
> +#include <linux/errno.h>
> +#include <linux/ctype.h>
> +#include <linux/notifier.h>
> +#include <linux/netdevice.h>
> +#include <linux/if_arp.h>
> +#include <linux/socket.h>
> +#include <linux/etherdevice.h>
> +#include <linux/rtnetlink.h>
> +#include <net/rtnetlink.h>
> +#include <net/genetlink.h>
> +#include <net/netlink.h>
> +#include <linux/if_team.h>
> +
> +#define DRV_NAME "team"
> +
> +
> +/**********
> + * Helpers
> + **********/
> +
> +#define team_port_exists(dev) (dev->priv_flags & IFF_TEAM_PORT)
> +
> +static struct team_port *team_port_get_rcu(const struct net_device *dev)
> +{
> +	struct team_port *port = rcu_dereference(dev->rx_handler_data);
> +
> +	return team_port_exists(dev) ? port : NULL;
> +}
> +
> +static struct team_port *team_port_get_rtnl(const struct net_device *dev)
> +{
> +	struct team_port *port = rtnl_dereference(dev->rx_handler_data);
> +
> +	return team_port_exists(dev) ? port : NULL;
> +}
> +
> +/*
> + * Since the ability to change mac address for open port device is tested in
> + * team_port_add, this function can be called without control of return value
> + */
> +static int __set_port_mac(struct net_device *port_dev,
> +			  const unsigned char *dev_addr)
> +{
> +	struct sockaddr addr;
> +
> +	memcpy(addr.sa_data, dev_addr, ETH_ALEN);
> +	addr.sa_family = ARPHRD_ETHER;
> +	return dev_set_mac_address(port_dev, &addr);
> +}
> +
> +int team_port_set_orig_mac(struct team_port *port)
> +{
> +	return __set_port_mac(port->dev, port->orig.dev_addr);
> +}
> +EXPORT_SYMBOL(team_port_set_orig_mac);
> +
> +int team_port_set_team_mac(struct team_port *port)
> +{
> +	return __set_port_mac(port->dev, port->team->dev->dev_addr);
> +}
> +EXPORT_SYMBOL(team_port_set_team_mac);
> +
> +
> +/*******************
> + * Options handling
> + *******************/
> +
> +void team_options_register(struct team *team, struct team_option *option,
> +			   size_t option_count)
> +{
> +	int i;
> +
> +	for (i = 0; i < option_count; i++, option++)
> +		list_add_tail(&option->list, &team->option_list);
> +}
> +EXPORT_SYMBOL(team_options_register);
> +
> +static void __team_options_change_check(struct team *team,
> +					struct team_option *changed_option);
> +
> +static void __team_options_unregister(struct team *team,
> +				      struct team_option *option,
> +				      size_t option_count)
> +{
> +	int i;
> +
> +	for (i = 0; i < option_count; i++, option++)
> +		list_del(&option->list);
> +}
> +
> +void team_options_unregister(struct team *team, struct team_option *option,
> +			     size_t option_count)
> +{
> +	__team_options_unregister(team, option, option_count);
> +	__team_options_change_check(team, NULL);
> +}
> +EXPORT_SYMBOL(team_options_unregister);
> +
> +static int team_option_get(struct team *team, struct team_option *option,
> +			   void *arg)
> +{
> +	return option->getter(team, arg);
> +}
> +
> +static int team_option_set(struct team *team, struct team_option *option,
> +			   void *arg)
> +{
> +	int err;
> +
> +	err = option->setter(team, arg);
> +	if (err)
> +		return err;
> +
> +	__team_options_change_check(team, option);
> +	return err;
> +}
> +
> +/****************
> + * Mode handling
> + ****************/
> +
> +static LIST_HEAD(mode_list);
> +static DEFINE_SPINLOCK(mode_list_lock);
> +
> +static struct team_mode *__find_mode(const char *kind)
> +{
> +	struct team_mode *mode;
> +
> +	list_for_each_entry(mode, &mode_list, list) {
> +		if (strcmp(mode->kind, kind) == 0)
> +			return mode;
> +	}
> +	return NULL;
> +}
> +
> +static bool is_good_mode_name(const char *name)
> +{
> +	while (*name != '\0') {
> +		if (!isalpha(*name) && !isdigit(*name) && *name != '_')
> +			return false;
> +		name++;
> +	}
> +	return true;
> +}
> +
> +int team_mode_register(struct team_mode *mode)
> +{
> +	int err = 0;
> +
> +	if (!is_good_mode_name(mode->kind) ||
> +	    mode->priv_size > TEAM_MODE_PRIV_SIZE)
> +		return -EINVAL;
> +	spin_lock(&mode_list_lock);
> +	if (__find_mode(mode->kind)) {
> +		err = -EEXIST;
> +		goto unlock;
> +	}
> +	list_add_tail(&mode->list, &mode_list);
> +unlock:
> +	spin_unlock(&mode_list_lock);
> +	return err;
> +}
> +EXPORT_SYMBOL(team_mode_register);
> +
> +int team_mode_unregister(struct team_mode *mode)
> +{
> +	spin_lock(&mode_list_lock);
> +	list_del_init(&mode->list);
> +	spin_unlock(&mode_list_lock);
> +	return 0;
> +}
> +EXPORT_SYMBOL(team_mode_unregister);
> +
> +static struct team_mode *team_mode_get(const char *kind)
> +{
> +	struct team_mode *mode;
> +
> +	spin_lock(&mode_list_lock);
> +	mode = __find_mode(kind);
> +	if (!mode) {
> +		spin_unlock(&mode_list_lock);
> +		request_module("team-mode-%s", kind);
> +		spin_lock(&mode_list_lock);
> +		mode = __find_mode(kind);
> +	}
> +	if (mode)
> +		if (!try_module_get(mode->owner))
> +			mode = NULL;
> +
> +	spin_unlock(&mode_list_lock);
> +	return mode;
> +}
> +
> +static void team_mode_put(const char *kind)
> +{
> +	struct team_mode *mode;
> +
> +	spin_lock(&mode_list_lock);
> +	mode = __find_mode(kind);
> +	BUG_ON(!mode);
> +	module_put(mode->owner);
> +	spin_unlock(&mode_list_lock);
> +}
> +
> +/*
> + * We can benefit from the fact that it's ensured no port is present
> + * at the time of mode change.
> + */
> +static int __team_change_mode(struct team *team,
> +			      const struct team_mode *new_mode)
> +{
> +	/* Check if mode was previously set and do cleanup if so */
> +	if (team->mode_kind) {
> +		void (*exit_op)(struct team *team) = team->mode_ops.exit;
> +
> +		/* Clear ops area so no callback is called any longer */
> +		memset(&team->mode_ops, 0, sizeof(struct team_mode_ops));
> +
> +		synchronize_rcu();
> +
> +		if (exit_op)
> +			exit_op(team);
> +		team_mode_put(team->mode_kind);
> +		team->mode_kind = NULL;
> +		/* zero private data area */
> +		memset(&team->mode_priv, 0,
> +		       sizeof(struct team) - offsetof(struct team, mode_priv));
> +	}
> +
> +	if (!new_mode)
> +		return 0;
> +
> +	if (new_mode->ops->init) {
> +		int err;
> +
> +		err = new_mode->ops->init(team);
> +		if (err)
> +			return err;
> +	}
> +
> +	team->mode_kind = new_mode->kind;
> +	memcpy(&team->mode_ops, new_mode->ops, sizeof(struct team_mode_ops));
> +
> +	return 0;
> +}
> +
> +static int team_change_mode(struct team *team, const char *kind)
> +{
> +	struct team_mode *new_mode;
> +	struct net_device *dev = team->dev;
> +	int err;
> +
> +	if (!list_empty(&team->port_list)) {
> +		netdev_err(dev, "No ports can be present during mode change\n");
> +		return -EBUSY;
> +	}
> +
> +	if (team->mode_kind && strcmp(team->mode_kind, kind) == 0) {
> +		netdev_err(dev, "Unable to change to the same mode the team is in\n");
> +		return -EINVAL;
> +	}
> +
> +	new_mode = team_mode_get(kind);
> +	if (!new_mode) {
> +		netdev_err(dev, "Mode \"%s\" not found\n", kind);
> +		return -EINVAL;
> +	}
> +
> +	err = __team_change_mode(team, new_mode);
> +	if (err) {
> +		netdev_err(dev, "Failed to change to mode \"%s\"\n", kind);
> +		team_mode_put(kind);
> +		return err;
> +	}
> +
> +	netdev_info(dev, "Mode changed to \"%s\"\n", kind);
> +	return 0;
> +}
> +
> +
> +/************************
> + * Rx path frame handler
> + ************************/
> +
> +/* note: already called with rcu_read_lock */
> +static rx_handler_result_t team_handle_frame(struct sk_buff **pskb)
> +{
> +	struct sk_buff *skb = *pskb;
> +	struct team_port *port;
> +	struct team *team;
> +	rx_handler_result_t res = RX_HANDLER_ANOTHER;
> +
> +	skb = skb_share_check(skb, GFP_ATOMIC);
> +	if (!skb)
> +		return RX_HANDLER_CONSUMED;
> +
> +	*pskb = skb;
> +
> +	port = team_port_get_rcu(skb->dev);
> +	team = port->team;
> +
> +	if (team->mode_ops.receive)
> +		res = team->mode_ops.receive(team, port, skb);
> +
> +	if (res == RX_HANDLER_ANOTHER) {
> +		struct team_pcpu_stats *pcpu_stats;
> +
> +		pcpu_stats = this_cpu_ptr(team->pcpu_stats);
> +		u64_stats_update_begin(&pcpu_stats->syncp);
> +		pcpu_stats->rx_packets++;
> +		pcpu_stats->rx_bytes += skb->len;
> +		if (skb->pkt_type == PACKET_MULTICAST)
> +			pcpu_stats->rx_multicast++;
> +		u64_stats_update_end(&pcpu_stats->syncp);
> +
> +		skb->dev = team->dev;
> +	} else {
> +		this_cpu_inc(team->pcpu_stats->rx_dropped);
> +	}
> +
> +	return res;
> +}
> +
> +
> +/****************
> + * Port handling
> + ****************/
> +
> +static bool team_port_find(const struct team *team,
> +			   const struct team_port *port)
> +{
> +	struct team_port *cur;
> +
> +	list_for_each_entry(cur, &team->port_list, list)
> +		if (cur == port)
> +			return true;
> +	return false;
> +}
> +
> +static int team_port_list_init(struct team *team)
> +{
> +	int i;
> +	struct hlist_head *hash;
> +
> +	hash = kmalloc(sizeof(*hash) * TEAM_PORT_HASHENTRIES, GFP_KERNEL);
> +	if (!hash)
> +		return -ENOMEM;
> +
> +	for (i = 0; i < TEAM_PORT_HASHENTRIES; i++)
> +		INIT_HLIST_HEAD(&hash[i]);
> +	team->port_hlist = hash;
> +	INIT_LIST_HEAD(&team->port_list);
> +	return 0;
> +}
> +
> +static void team_port_list_fini(struct team *team)
> +{
> +	kfree(team->port_hlist);
> +}
> +
> +/*
> + * Add/delete port to the team port list. Write guarded by rtnl_lock.
> + * Takes care of correct port->index setup (might be racy).
> + */
> +static void team_port_list_add_port(struct team *team,
> +				    struct team_port *port)
> +{
> +	port->index = team->port_count++;
> +	hlist_add_head_rcu(&port->hlist,
> +			   team_port_index_hash(team, port->index));
> +	list_add_tail_rcu(&port->list, &team->port_list);
> +}
> +
> +static void __reconstruct_port_hlist(struct team *team, int rm_index)
> +{
> +	int i;
> +	struct team_port *port;
> +
> +	for (i = rm_index + 1; i < team->port_count; i++) {
> +		port = team_get_port_by_index_rcu(team, i);
> +		hlist_del_rcu(&port->hlist);
> +		port->index--;
> +		hlist_add_head_rcu(&port->hlist,
> +				   team_port_index_hash(team, port->index));
> +	}
> +}
> +
> +static void team_port_list_del_port(struct team *team,
> +				   struct team_port *port)
> +{
> +	int rm_index = port->index;
> +
> +	hlist_del_rcu(&port->hlist);
> +	list_del_rcu(&port->list);
> +	__reconstruct_port_hlist(team, rm_index);
> +	team->port_count--;
> +}
> +
> +#define TEAM_VLAN_FEATURES (NETIF_F_ALL_CSUM | NETIF_F_SG | \
> +			    NETIF_F_FRAGLIST | NETIF_F_ALL_TSO | \
> +			    NETIF_F_HIGHDMA | NETIF_F_LRO)
> +
> +static void __team_compute_features(struct team *team)
> +{
> +	struct team_port *port;
> +	u32 vlan_features = TEAM_VLAN_FEATURES;
> +	unsigned short max_hard_header_len = ETH_HLEN;
> +
> +	list_for_each_entry(port, &team->port_list, list) {
> +		vlan_features = netdev_increment_features(vlan_features,
> +					port->dev->vlan_features,
> +					TEAM_VLAN_FEATURES);
> +
> +		if (port->dev->hard_header_len > max_hard_header_len)
> +			max_hard_header_len = port->dev->hard_header_len;
> +	}
> +
> +	team->dev->vlan_features = vlan_features;
> +	team->dev->hard_header_len = max_hard_header_len;
> +
> +	netdev_change_features(team->dev);
> +}
> +
> +static void team_compute_features(struct team *team)
> +{
> +	spin_lock(&team->lock);
> +	__team_compute_features(team);
> +	spin_unlock(&team->lock);
> +}
> +
> +static int team_port_enter(struct team *team, struct team_port *port)
> +{
> +	int err = 0;
> +
> +	dev_hold(team->dev);
> +	port->dev->priv_flags |= IFF_TEAM_PORT;
> +	if (team->mode_ops.port_enter) {
> +		err = team->mode_ops.port_enter(team, port);
> +		if (err)
> +			netdev_err(team->dev, "Device %s failed to enter team mode\n",
> +				   port->dev->name);
> +	}
> +	return err;
> +}
> +
> +static void team_port_leave(struct team *team, struct team_port *port)
> +{
> +	if (team->mode_ops.port_leave)
> +		team->mode_ops.port_leave(team, port);
> +	port->dev->priv_flags &= ~IFF_TEAM_PORT;
> +	dev_put(team->dev);
> +}
> +
> +static void __team_port_change_check(struct team_port *port, bool linkup);
> +
> +static int team_port_add(struct team *team, struct net_device *port_dev)
> +{
> +	struct net_device *dev = team->dev;
> +	struct team_port *port;
> +	char *portname = port_dev->name;
> +	char tmp_addr[ETH_ALEN];
> +	int err;
> +
> +	if (port_dev->flags & IFF_LOOPBACK ||
> +	    port_dev->type != ARPHRD_ETHER) {
> +		netdev_err(dev, "Device %s is of an unsupported type\n",
> +			   portname);
> +		return -EINVAL;
> +	}
> +
> +	if (team_port_exists(port_dev)) {
> +		netdev_err(dev, "Device %s is already a port "
> +				"of a team device\n", portname);
> +		return -EBUSY;
> +	}
> +
> +	if (port_dev->flags & IFF_UP) {
> +		netdev_err(dev, "Device %s is up. Set it down before adding it as a team port\n",
> +			   portname);
> +		return -EBUSY;
> +	}
> +
> +	port = kzalloc(sizeof(struct team_port), GFP_KERNEL);
> +	if (!port)
> +		return -ENOMEM;
> +
> +	port->dev = port_dev;
> +	port->team = team;
> +
> +	port->orig.mtu = port_dev->mtu;
> +	err = dev_set_mtu(port_dev, dev->mtu);
> +	if (err) {
> +		netdev_dbg(dev, "Error %d calling dev_set_mtu\n", err);
> +		goto err_set_mtu;
> +	}
> +
> +	memcpy(port->orig.dev_addr, port_dev->dev_addr, ETH_ALEN);
> +	random_ether_addr(tmp_addr);
> +	err = __set_port_mac(port_dev, tmp_addr);
> +	if (err) {
> +		netdev_dbg(dev, "Device %s mac addr set failed\n",
> +			   portname);
> +		goto err_set_mac_rand;
> +	}
> +
> +	err = dev_open(port_dev);
> +	if (err) {
> +		netdev_dbg(dev, "Device %s opening failed\n",
> +			   portname);
> +		goto err_dev_open;
> +	}
> +
> +	err = team_port_set_orig_mac(port);
> +	if (err) {
> +		netdev_dbg(dev, "Device %s mac addr set failed - Device does not support addr change when it's opened\n",
> +			   portname);
> +		goto err_set_mac_opened;
> +	}
> +
> +	err = team_port_enter(team, port);
> +	if (err) {
> +		netdev_err(dev, "Device %s failed to enter team mode\n",
> +			   portname);
> +		goto err_port_enter;
> +	}
> +
> +	err = netdev_set_master(port_dev, dev);
> +	if (err) {
> +		netdev_err(dev, "Device %s failed to set master\n", portname);
> +		goto err_set_master;
> +	}
> +
> +	err = netdev_rx_handler_register(port_dev, team_handle_frame,
> +					 port);
> +	if (err) {
> +		netdev_err(dev, "Device %s failed to register rx_handler\n",
> +			   portname);
> +		goto err_handler_register;
> +	}
> +
> +	team_port_list_add_port(team, port);
> +	__team_compute_features(team);
> +	__team_port_change_check(port, !!netif_carrier_ok(port_dev));
> +
> +	netdev_info(dev, "Port device %s added\n", portname);
> +
> +	return 0;
> +
> +err_handler_register:
> +	netdev_set_master(port_dev, NULL);
> +
> +err_set_master:
> +	team_port_leave(team, port);
> +
> +err_port_enter:
> +err_set_mac_opened:
> +	dev_close(port_dev);
> +
> +err_dev_open:
> +	team_port_set_orig_mac(port);
> +
> +err_set_mac_rand:
> +	dev_set_mtu(port_dev, port->orig.mtu);
> +
> +err_set_mtu:
> +	kfree(port);
> +
> +	return err;
> +}
> +
> +static int team_port_del(struct team *team, struct net_device *port_dev)
> +{
> +	struct net_device *dev = team->dev;
> +	struct team_port *port;
> +	char *portname = port_dev->name;
> +
> +	port = team_port_get_rtnl(port_dev);
> +	if (!port || !team_port_find(team, port)) {
> +		netdev_err(dev, "Device %s does not act as a port of this team\n",
> +			   portname);
> +		return -ENOENT;
> +	}
> +
> +	__team_port_change_check(port, false);
> +	team_port_list_del_port(team, port);
> +	netdev_rx_handler_unregister(port_dev);
> +	netdev_set_master(port_dev, NULL);
> +	team_port_leave(team, port);
> +	dev_close(port_dev);
> +	team_port_set_orig_mac(port);
> +	dev_set_mtu(port_dev, port->orig.mtu);
> +	synchronize_rcu();
> +	kfree(port);
> +	netdev_info(dev, "Port device %s removed\n", portname);
> +	__team_compute_features(team);
> +
> +	return 0;
> +}
> +
> +
> +/*****************
> + * Net device ops
> + *****************/
> +
> +static const char team_no_mode_kind[] = "*NOMODE*";
> +
> +static int team_mode_option_get(struct team *team, void *arg)
> +{
> +	const char **str = arg;
> +
> +	*str = team->mode_kind ? team->mode_kind : team_no_mode_kind;
> +	return 0;
> +}
> +
> +static int team_mode_option_set(struct team *team, void *arg)
> +{
> +	const char **str = arg;
> +
> +	return team_change_mode(team, *str);
> +}
> +
> +static struct team_option team_options[] = {
> +	{
> +		.name = "mode",
> +		.type = TEAM_OPTION_TYPE_STRING,
> +		.getter = team_mode_option_get,
> +		.setter = team_mode_option_set,
> +	},
> +};
> +
> +static int team_init(struct net_device *dev)
> +{
> +	struct team *team = netdev_priv(dev);
> +	int err;
> +
> +	team->dev = dev;
> +	spin_lock_init(&team->lock);
> +
> +	team->pcpu_stats = alloc_percpu(struct team_pcpu_stats);
> +	if (!team->pcpu_stats)
> +		return -ENOMEM;
> +
> +	err = team_port_list_init(team);
> +	if (err)
> +		goto err_port_list_init;
> +
> +	INIT_LIST_HEAD(&team->option_list);
> +	team_options_register(team, team_options, ARRAY_SIZE(team_options));
> +	netif_carrier_off(dev);
> +
> +	return 0;
> +
> +err_port_list_init:
> +
> +	free_percpu(team->pcpu_stats);
> +
> +	return err;
> +}
> +
> +static void team_uninit(struct net_device *dev)
> +{
> +	struct team *team = netdev_priv(dev);
> +	struct team_port *port;
> +	struct team_port *tmp;
> +
> +	spin_lock(&team->lock);
> +	list_for_each_entry_safe(port, tmp, &team->port_list, list)
> +		team_port_del(team, port->dev);
> +
> +	__team_change_mode(team, NULL); /* cleanup */
> +	__team_options_unregister(team, team_options, ARRAY_SIZE(team_options));
> +	spin_unlock(&team->lock);
> +}
> +
> +static void team_destructor(struct net_device *dev)
> +{
> +	struct team *team = netdev_priv(dev);
> +
> +	team_port_list_fini(team);
> +	free_percpu(team->pcpu_stats);
> +	free_netdev(dev);
> +}
> +
> +static int team_open(struct net_device *dev)
> +{
> +	netif_carrier_on(dev);
> +	return 0;
> +}
> +
> +static int team_close(struct net_device *dev)
> +{
> +	netif_carrier_off(dev);
> +	return 0;
> +}
> +
> +/*
> + * note: already called with rcu_read_lock
> + */
> +static netdev_tx_t team_xmit(struct sk_buff *skb, struct net_device *dev)
> +{
> +	struct team *team = netdev_priv(dev);
> +	bool tx_success = false;
> +	unsigned int len = skb->len;
> +
> +	/*
> +	 * Ensure transmit function is called only in case there is at least
> +	 * one port present.
> +	 */
> +	if (likely(!list_empty(&team->port_list) && team->mode_ops.transmit))
> +		tx_success = team->mode_ops.transmit(team, skb);
> +	if (tx_success) {
> +		struct team_pcpu_stats *pcpu_stats;
> +
> +		pcpu_stats = this_cpu_ptr(team->pcpu_stats);
> +		u64_stats_update_begin(&pcpu_stats->syncp);
> +		pcpu_stats->tx_packets++;
> +		pcpu_stats->tx_bytes += len;
> +		u64_stats_update_end(&pcpu_stats->syncp);
> +	} else {
> +		this_cpu_inc(team->pcpu_stats->tx_dropped);
> +	}
> +
> +	return NETDEV_TX_OK;
> +}
> +
> +static void team_change_rx_flags(struct net_device *dev, int change)
> +{
> +	struct team *team = netdev_priv(dev);
> +	struct team_port *port;
> +	int inc;
> +
> +	rcu_read_lock();
> +	list_for_each_entry_rcu(port, &team->port_list, list) {
> +		if (change & IFF_PROMISC) {
> +			inc = dev->flags & IFF_PROMISC ? 1 : -1;
> +			dev_set_promiscuity(port->dev, inc);
> +		}
> +		if (change & IFF_ALLMULTI) {
> +			inc = dev->flags & IFF_ALLMULTI ? 1 : -1;
> +			dev_set_allmulti(port->dev, inc);
> +		}
> +	}
> +	rcu_read_unlock();
> +}
> +
> +static void team_set_rx_mode(struct net_device *dev)
> +{
> +	struct team *team = netdev_priv(dev);
> +	struct team_port *port;
> +
> +	rcu_read_lock();
> +	list_for_each_entry_rcu(port, &team->port_list, list) {
> +		dev_uc_sync(port->dev, dev);
> +		dev_mc_sync(port->dev, dev);
> +	}
> +	rcu_read_unlock();
> +}
> +
> +static int team_set_mac_address(struct net_device *dev, void *p)
> +{
> +	struct team *team = netdev_priv(dev);
> +	struct team_port *port;
> +	struct sockaddr *addr = p;
> +
> +	memcpy(dev->dev_addr, addr->sa_data, ETH_ALEN);
> +	rcu_read_lock();
> +	list_for_each_entry_rcu(port, &team->port_list, list)
> +		if (team->mode_ops.port_change_mac)
> +			team->mode_ops.port_change_mac(team, port);
> +	rcu_read_unlock();
> +	return 0;
> +}
> +
> +static int team_change_mtu(struct net_device *dev, int new_mtu)
> +{
> +	struct team *team = netdev_priv(dev);
> +	struct team_port *port;
> +	int err;
> +
> +	rcu_read_lock();
> +	list_for_each_entry_rcu(port, &team->port_list, list) {
> +		err = dev_set_mtu(port->dev, new_mtu);
> +		if (err) {
> +			netdev_err(dev, "Device %s failed to change mtu",
> +				   port->dev->name);
> +			goto unwind;
> +		}
> +	}
> +	rcu_read_unlock();
> +
> +	dev->mtu = new_mtu;
> +
> +	return 0;
> +
> +unwind:
> +	list_for_each_entry_continue_reverse(port, &team->port_list, list)
> +		dev_set_mtu(port->dev, dev->mtu);

It may be worth noting that backwards list traversal is not rcu safe.
Under rcu_read_lock() list elements will not be freed but the list may
be modified. Moreover, list_del_rcu() poisons ->prev pointers.

In this case it doesn't really matter though. As Eric pointed out
previously, we are under rtnl protection and no rcu is needed. Perhaps
all the extra rcu locking should be removed?

-Ben

> +
> +	rcu_read_unlock();
> +	return err;
> +}
> +
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ