[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20251214182449.3900190-2-vladimir.oltean@nxp.com>
Date: Sun, 14 Dec 2025 20:24:49 +0200
From: Vladimir Oltean <vladimir.oltean@....com>
To: netdev@...r.kernel.org
Cc: Andrew Lunn <andrew@...n.ch>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Ma Ke <make24@...as.ac.cn>,
Jonas Gorski <jonas.gorski@...il.com>,
Simon Horman <horms@...nel.org>,
Florian Fainelli <f.fainelli@...il.com>
Subject: [PATCH net 2/2] net: dsa: fix missing put_device() in dsa_tree_find_first_conduit()
of_find_net_device_by_node() searches net devices by their /sys/class/net/,
entry. It is documented in its kernel-doc that:
* If successful, returns a pointer to the net_device with the embedded
* struct device refcount incremented by one, or NULL on failure. The
* refcount must be dropped when done with the net_device.
We are missing a put_device(&conduit->dev) which we could place at the
end of dsa_tree_find_first_conduit(). But to explain why calling
put_device() right away is safe is the same as to explain why the chosen
solution is different.
The code is very poorly split: dsa_tree_find_first_conduit() was first
introduced in commit 95f510d0b792 ("net: dsa: allow the DSA master to be
seen and changed through rtnetlink") but was first used several commits
later, in commit acc43b7bf52a ("net: dsa: allow masters to join a LAG").
Assume there is a switch with 2 CPU ports and 2 conduits, eno2 and eno3.
When we create a LAG (bonding or team device) and place eno2 and eno3
beneath it, we create a 3rd conduit (the LAG device itself), but this is
slightly different than the first two.
Namely, the cpu_dp->conduit pointer of the CPU ports does not change,
and remains pointing towards the physical Ethernet controllers which are
now LAG ports. Only 2 things change:
- the LAG device has a dev->dsa_ptr which marks it as a DSA conduit
- dsa_port_to_conduit(user port) finds the LAG and not the physical
conduit, because of the dp->cpu_port_in_lag bit being set.
When the LAG device is destroyed, dsa_tree_migrate_ports_from_lag_conduit()
is called and this is where dsa_tree_find_first_conduit() kicks in.
This is the logical mistake and the reason why introducing code in one
patch and using it from another is bad practice. I didn't realize that I
don't have to call of_find_net_device_by_node() again; the cpu_dp->conduit
association was never undone, and is still available for direct (re)use.
There's only one concern - maybe the conduit disappeared in the
meantime, but the netdev_hold() call we made during dsa_port_parse_cpu()
(see previous change) ensures that this was not the case.
Therefore, fixing the code means reimplementing it in the simplest way.
I am blaming the time of use, since this is what "git blame" would show
if we were to monitor for the conduit's kobject's refcount remaining
elevated instead of being freed.
Tested on the NXP LS1028A, using the steps from
Documentation/networking/dsa/configuration.rst section "Affinity of user
ports to CPU ports", followed by (extra prints added by me):
$ ip link del bond0
mscc_felix 0000:00:00.5 swp3: Link is Down
bond0 (unregistering): (slave eno2): Releasing backup interface
fsl_enetc 0000:00:00.2 eno2: Link is Down
mscc_felix 0000:00:00.5 swp0: bond0 disappeared, migrating to eno2
mscc_felix 0000:00:00.5 swp1: bond0 disappeared, migrating to eno2
mscc_felix 0000:00:00.5 swp2: bond0 disappeared, migrating to eno2
mscc_felix 0000:00:00.5 swp3: bond0 disappeared, migrating to eno2
Fixes: acc43b7bf52a ("net: dsa: allow masters to join a LAG")
Signed-off-by: Vladimir Oltean <vladimir.oltean@....com>
---
net/dsa/dsa.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index ac7900113d2b..e3b233060242 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -367,16 +367,10 @@ static struct dsa_port *dsa_tree_find_first_cpu(struct dsa_switch_tree *dst)
struct net_device *dsa_tree_find_first_conduit(struct dsa_switch_tree *dst)
{
- struct device_node *ethernet;
- struct net_device *conduit;
struct dsa_port *cpu_dp;
cpu_dp = dsa_tree_find_first_cpu(dst);
- ethernet = of_parse_phandle(cpu_dp->dn, "ethernet", 0);
- conduit = of_find_net_device_by_node(ethernet);
- of_node_put(ethernet);
-
- return conduit;
+ return cpu_dp->conduit;
}
/* Assign the default CPU port (the first one in the tree) to all ports of the
--
2.43.0
Powered by blists - more mailing lists