Date:   Tue, 1 Dec 2020 03:37:06 +0200
From:   Vladimir Oltean <olteanv@...il.com>
To:     Tobias Waldekranz <tobias@...dekranz.com>
Cc:     davem@...emloft.net, kuba@...nel.org, andrew@...n.ch,
        vivien.didelot@...il.com, f.fainelli@...il.com,
        j.vosburgh@...il.com, vfalico@...il.com, andy@...yhouse.net,
        netdev@...r.kernel.org
Subject: Re: [PATCH v2 net-next 2/4] net: dsa: Link aggregation support

Hi Tobias,

On Mon, Nov 30, 2020 at 03:06:08PM +0100, Tobias Waldekranz wrote:
> Monitor the following events and notify the driver when:
> 
> - A DSA port joins/leaves a LAG.
> - A LAG, made up of DSA ports, joins/leaves a bridge.
> - A DSA port in a LAG is enabled/disabled (enabled meaning
>   "distributing" in 802.3ad LACP terms).
> 
> Each LAG interface to which a DSA port is attached is represented by a
> `struct dsa_lag` which is globally reachable from the switch tree and
> from each associated port.
> 
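So, reading the include/net/dsa.h hunk, the LAG object boils down to roughly
the struct below (reconstructed from the fields used by the helpers quoted
further down; the real definition may carry more members):

struct dsa_lag {
	struct net_device __rcu *dev;	/* the bond/team upper */
	int id;				/* index into dst->lags.pool/busy */

	struct list_head ports;		/* all member DSA ports */
	struct list_head tx_ports;	/* members in "distributing" state */

	struct kref refcount;
};
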
> When a LAG joins a bridge, the DSA subsystem will treat that as each
> individual port joining the bridge. The driver may look at the port's
> LAG pointer to see if it is associated with any LAG, if that is
> required. This is analogous to how switchdev events are replicated out
> to all lower devices when reaching e.g. a LAG.
> 
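Just to check that I understand the intended driver flow here: assuming the
port carries the LAG pointer you describe (dp->lag), a driver that cares
would do something along these lines in its .port_bridge_join() (the driver
name and the foo_hw_* helpers are made up for illustration):

static int foo_port_bridge_join(struct dsa_switch *ds, int port,
				struct net_device *br)
{
	struct dsa_port *dp = dsa_to_port(ds, port);

	/* Port reaches the bridge through a LAG: offload based on the
	 * LAG id rather than the individual port.
	 */
	if (dp->lag)
		return foo_hw_bridge_join_lag(ds, dp->lag->id, br);

	return foo_hw_bridge_join_port(ds, port, br);
}
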
> Signed-off-by: Tobias Waldekranz <tobias@...dekranz.com>
> ---
>  include/net/dsa.h  |  97 +++++++++++++++++++++++++++++++
>  net/dsa/dsa2.c     |  51 ++++++++++++++++
>  net/dsa/dsa_priv.h |  31 ++++++++++
>  net/dsa/port.c     | 141 +++++++++++++++++++++++++++++++++++++++++++++
>  net/dsa/slave.c    |  83 ++++++++++++++++++++++++--
>  net/dsa/switch.c   |  49 ++++++++++++++++
>  6 files changed, 446 insertions(+), 6 deletions(-)
> 
> +static inline struct dsa_lag *dsa_lag_by_id(struct dsa_switch_tree *dst, int id)
> +{
> +	if (!test_bit(id, dst->lags.busy))
> +		return NULL;
> +
> +	return &dst->lags.pool[id];
> +}
> +
> +static inline struct net_device *dsa_lag_dev_by_id(struct dsa_switch_tree *dst,
> +						   int id)
> +{
> +	struct dsa_lag *lag = dsa_lag_by_id(dst, id);
> +
> +	return lag ? rcu_dereference(lag->dev) : NULL;
> +}
> +
> +static inline struct dsa_lag *dsa_lag_by_dev(struct dsa_switch_tree *dst,
> +					     struct net_device *dev)
> +{
> +	struct dsa_lag *lag;
> +	int id;
> +
> +	dsa_lag_foreach(id, dst) {
> +		lag = dsa_lag_by_id(dst, id);
> +
> +		if (rtnl_dereference(lag->dev) == dev)
> +			return lag;
> +	}
> +
> +	return NULL;
> +}
>  
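Also, since it matters for the comment I have further down: I assume the
lookup by id is meant for the tagger on the RX hot path, i.e. roughly the
pattern below. The helper name is made up and the tag parsing is hand-waved
away, the point is just where dsa_lag_dev_by_id() gets called from.

/* Called in softirq context, under the rcu_read_lock() that the
 * networking core already holds for packet reception.
 */
static struct sk_buff *foo_tag_rcv_lag(struct sk_buff *skb,
				       struct net_device *master,
				       int lag_id)
{
	struct dsa_port *cpu_dp = master->dsa_ptr;
	struct net_device *lag_dev;

	lag_dev = dsa_lag_dev_by_id(cpu_dp->dst, lag_id);
	if (!lag_dev)
		return NULL;	/* LAG is gone, drop the frame */

	skb->dev = lag_dev;
	return skb;
}
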
> diff --git a/net/dsa/port.c b/net/dsa/port.c
> index 73569c9af3cc..c2332ee5f5c7 100644
> --- a/net/dsa/port.c
> +++ b/net/dsa/port.c
> @@ -193,6 +193,147 @@ void dsa_port_bridge_leave(struct dsa_port *dp, struct net_device *br)
>  	dsa_port_set_state_now(dp, BR_STATE_FORWARDING);
>  }
>  
> +static struct dsa_lag *dsa_lag_get(struct dsa_switch_tree *dst,
> +				   struct net_device *dev)
> +{
> +	struct dsa_lag *lag;
> +	int id;
> +
> +	lag = dsa_lag_by_dev(dst, dev);
> +	if (lag) {
> +		kref_get(&lag->refcount);
> +		return lag;
> +	}
> +
> +	id = find_first_zero_bit(dst->lags.busy, dst->lags.num);
> +	if (id >= dst->lags.num) {
> +		WARN(1, "No LAGs available");
> +		return NULL;
> +	}
> +
> +	set_bit(id, dst->lags.busy);
> +
> +	lag = &dst->lags.pool[id];
> +	kref_init(&lag->refcount);
> +	lag->id = id;
> +	INIT_LIST_HEAD(&lag->ports);
> +	INIT_LIST_HEAD(&lag->tx_ports);
> +
> +	rcu_assign_pointer(lag->dev, dev);
> +	return lag;
> +}
> +
> +static void dsa_lag_release(struct kref *refcount)
> +{
> +	struct dsa_lag *lag = container_of(refcount, struct dsa_lag, refcount);
> +
> +	rcu_assign_pointer(lag->dev, NULL);
> +	synchronize_rcu();
> +	memset(lag, 0, sizeof(*lag));
> +}

What difference does it make if lag->dev is set to NULL right away or
after a grace period? Squeezing one last packet from that bonding interface?
Pointer updates are atomic on all architectures that the kernel supports,
and, as long as you use WRITE_ONCE() and READ_ONCE() to avoid load/store
tearing, there should be no need for RCU protection here that I can see.
And unlike typical uses of RCU, you do not free lag->dev, because you do
not own lag->dev. Instead, the bonding interface pointed to by lag->dev
is going to be freed (in case of a deletion using ip link) after an RCU
grace period anyway. And the receive data path is under an RCU read-side
critical section anyway. So even if you set lag->dev to NULL using
WRITE_ONCE, the existing in-flight readers from the RX data path that
had called dsa_lag_dev_by_id() will still hold a reference to a valid
bonding interface.
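
Concretely, this is the kind of thing I had in mind (untested, and only
touching the lag->dev accesses; the __rcu annotation on lag->dev would then
go away as well):

static inline struct net_device *dsa_lag_dev_by_id(struct dsa_switch_tree *dst,
						   int id)
{
	struct dsa_lag *lag = dsa_lag_by_id(dst, id);

	/* Pairs with the WRITE_ONCE() in dsa_lag_release(): a reader
	 * either sees the bonding interface (which is itself only freed
	 * after an RCU grace period) or NULL, never a torn pointer.
	 */
	return lag ? READ_ONCE(lag->dev) : NULL;
}

static void dsa_lag_release(struct kref *refcount)
{
	struct dsa_lag *lag = container_of(refcount, struct dsa_lag, refcount);

	WRITE_ONCE(lag->dev, NULL);
	/* rest of the teardown as in your patch, minus the synchronize_rcu() */
}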
