netdev - Re: [PATCH v3 net-next 2/4] net: dsa: Link aggregation support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20201207232622.GA2475764@lunn.ch>
Date:   Tue, 8 Dec 2020 00:26:22 +0100
From:   Andrew Lunn <andrew@...n.ch>
To:     Tobias Waldekranz <tobias@...dekranz.com>
Cc:     davem@...emloft.net, kuba@...nel.org, vivien.didelot@...il.com,
        f.fainelli@...il.com, olteanv@...il.com, j.vosburgh@...il.com,
        vfalico@...il.com, andy@...yhouse.net, netdev@...r.kernel.org
Subject: Re: [PATCH v3 net-next 2/4] net: dsa: Link aggregation support

On Mon, Dec 07, 2020 at 10:19:47PM +0100, Tobias Waldekranz wrote:
> On Fri, Dec 04, 2020 at 03:20, Andrew Lunn <andrew@...n.ch> wrote:
> >> +static int dsa_tree_setup_lags(struct dsa_switch_tree *dst)
> >> +{
> >> +	struct dsa_port *dp;
> >> +	unsigned int num;
> >> +
> >> +	list_for_each_entry(dp, &dst->ports, list)
> >> +		num = dp->ds->num_lags;
> >> +
> >> +	list_for_each_entry(dp, &dst->ports, list)
> >> +		num = min(num, dp->ds->num_lags);
> >
> > Do you really need to loop over the list twice? Cannot num be
> > initialised to UINT_MAX and then just do the second loop.
> 
> I was mostly paranoid about the case where, for some reason, the list of
> ports was empty due to an invalid DT or something. But I now see that
> since num is not initialized, that would not have helped.
> 
> So, is my paranoia valid, i.e. fix is `unsigned int num = 0`? Or can
> that never happen, i.e. fix is to initialize to UINT_MAX and remove
> first loop?

I would probably initialize to UINT_MAX, and do a WARN_ON(num ==
UINT_MAX) afterwards. I don't think the DT parsing code prevents a
setup with no ports, so it could happen.

> >> +static inline bool dsa_port_can_offload(struct dsa_port *dp,
> >> +					struct net_device *dev)
> >
> > That name is a bit generic. We have a number of different offloads.
> > The mv88E6060 cannot offload anything!
> 
> The name is intentionally generic as it answers the question "can this
> dp offload requests for this netdev?"

I think it is a bit more specific. Mirroring is an offload, but is not
taken into account here, and does mirroring setup call this to see if
mirroring can be offloaded? The hardware has rate control traffic
shaping which we might sometime add support for via TC. That again is
an offload.

> >> +{
> >> +	/* Switchdev offloading can be configured on: */
> >> +
> >> +	if (dev == dp->slave)
> >> +		/* DSA ports directly connected to a bridge. */
> >> +		return true;
> 
> This condition is the normal case of a bridged port, i.e. no LAG
> involved.
> 
> >> +	if (dp->lag && dev == rtnl_dereference(dp->lag->dev))
> >> +		/* DSA ports connected to a bridge via a LAG */
> >> +		return true;
> 
> And then the indirect case of a bridged port under a LAG.
> 
> I am happy to take requests for a better name though.

There probably needs to be lag in the name.

> 
> >> +	return false;
> >> +}
> >
> >> +static void dsa_lag_put(struct dsa_switch_tree *dst, struct dsa_lag *lag)
> >> +{
> >> +	if (!refcount_dec_and_test(&lag->refcount))
> >> +		return;
> >> +
> >> +	clear_bit(lag->id, dst->lags.busy);
> >> +	WRITE_ONCE(lag->dev, NULL);
> >> +	memset(lag, 0, sizeof(*lag));
> >> +}
> >
> > I don't know what the locking is here, but wouldn't it be safer to
> > clear the bit last, after the memset and WRITE_ONCE.
> 
> All writers of dst->lags.busy are serialized with respect to dsa_lag_put
> (on rtnl_lock), and concurrent readers (dsa_lag_dev_by_id) start by
> checking busy before reading lag->dev. To my understanding, WRITE_ONCE
> would insert the proper fence to make sure busy was cleared before
> clearing dev?

I was thinking about the lag being freed and then immediately
reused. So it sounds the RTNL allows the WRITE_ONCE and memset to
happen before the next user comes along. So this is O.K.  But maybe
you can document your locking design?

       Andrew