[<prev] [next>] [day] [month] [year] [list]
Message-Id: <20260205192344.21797-1-vladimir.oltean@nxp.com>
Date: Thu, 5 Feb 2026 21:23:44 +0200
From: Vladimir Oltean <vladimir.oltean@....com>
To: netdev@...r.kernel.org
Cc: Andrew Lunn <andrew@...n.ch>,
Heiner Kallweit <hkallweit1@...il.com>,
Russell King <linux@...linux.org.uk>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
linux-kernel@...r.kernel.org
Subject: [PATCH net-next] net: phylink: guard link replay helpers against NULL phylink instance
There is a crash when unbinding the sja1105 driver under special
circumstances:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000030
Call trace:
phylink_run_resolve_and_disable+0x10/0x90
sja1105_static_config_reload+0xc0/0x410
sja1105_vlan_filtering+0x100/0x140
dsa_port_vlan_filtering+0x13c/0x368
dsa_port_reset_vlan_filtering.isra.0+0xe8/0x198
dsa_port_bridge_leave+0x130/0x248
dsa_user_changeupper.part.0+0x74/0x158
dsa_user_netdevice_event+0x50c/0xa50
notifier_call_chain+0x78/0x148
raw_notifier_call_chain+0x20/0x38
call_netdevice_notifiers_info+0x58/0xa8
__netdev_upper_dev_unlink+0xac/0x220
netdev_upper_dev_unlink+0x38/0x70
del_nbp+0x1a4/0x320
br_del_if+0x3c/0xd8
br_device_event+0xf8/0x2d8
notifier_call_chain+0x78/0x148
raw_notifier_call_chain+0x20/0x38
call_netdevice_notifiers_info+0x58/0xa8
unregister_netdevice_many_notify+0x314/0x848
unregister_netdevice_queue+0xe8/0xf8
dsa_user_destroy+0x50/0xa8
dsa_port_teardown+0x80/0x98
dsa_switch_teardown_ports+0x4c/0xb8
dsa_switch_deinit+0x94/0xb8
dsa_switch_put_tree+0x2c/0xc0
dsa_unregister_switch+0x38/0x60
sja1105_remove+0x24/0x40
spi_remove+0x38/0x60
device_remove+0x54/0x90
device_release_driver_internal+0x1d4/0x230
device_driver_detach+0x20/0x38
unbind_store+0xbc/0xc8
---[ end trace 0000000000000000 ]---
which requires an explanation.
When a port offloads a bridge, the switch must be reset to change
the VLAN awareness state (the SJA1105_VLAN_FILTERING reason for
sja1105_static_config_reload()). When the port leaves a VLAN-aware
bridge, it must also be reset for the same reason: it is returning
to operation as a VLAN-unaware standalone port.
sja1105_static_config_reload() triggers the phylink link replay helpers.
Because sja1105 is a switch, it has multiple user ports. During unbind,
ports are torn down one by one in dsa_switch_teardown_ports() ->
dsa_port_teardown() -> dsa_user_destroy().
The crash happens when the numerically first user port is not part of
the VLAN-aware bridge, but any other user port is.
Tearing down the first user port causes phylink_destroy() to be called
on dp->pl, and this pointer to be set to NULL. Then, when the second
user port is torn down, this was offloading a VLAN-aware bridge port, so
indirectly it will trigger sja1105_static_config_reload().
The latter function iterates using dsa_switch_for_each_available_port(),
and unconditionally dereferences dp->pl, including for the
aforementioned torn down previous port, and passes that to phylink.
This is where the NULL pointer is coming from.
There are multiple levels at which this could be avoided:
- add an "if (dp->pl)" in sja1105_static_config_reload()
- make the phylink replay helpers NULL-tolerant
- mark ports as DSA_PORT_TYPE_UNUSED after dsa_port_phylink_destroy()
has run, such that subsequent dsa_switch_for_each_available_port()
iterations skip them
- disconnect the entire switch at once from switchdev and
NETDEV_CHANGEUPPER events while unbinding, not just port by port,
likely using a "ds->unbinding = true" mechanism or similar
however options 3 and 4 are quite heavy and might have side effects,
option 1 is very unassuming and option 2 seems a more elegant variant
of 1, given the fact that sja1105 is the only user of these phylink
replay helpers. It allows to keep the driver simple and is the option
I went with.
Functionally speaking, transforming the replay helpers into no-ops for
ports without a phylink instance is fine, because that only happens
during driver removal (an operation which cannot be cancelled). The
ports are not required to work.
Fixes: 0b2edc531e0b ("net: dsa: sja1105: let phylink help with the replay of link callbacks")
Signed-off-by: Vladimir Oltean <vladimir.oltean@....com>
---
drivers/net/phy/phylink.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index e1f01d7fc4da..c09c357d0fbd 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -4379,6 +4379,9 @@ void phylink_replay_link_begin(struct phylink *pl)
{
ASSERT_RTNL();
+ if (!pl)
+ return;
+
phylink_run_resolve_and_disable(pl, PHYLINK_DISABLE_REPLAY);
}
EXPORT_SYMBOL_GPL(phylink_replay_link_begin);
@@ -4402,6 +4405,9 @@ void phylink_replay_link_end(struct phylink *pl)
{
ASSERT_RTNL();
+ if (!pl)
+ return;
+
if (WARN(!test_bit(PHYLINK_DISABLE_REPLAY,
&pl->phylink_disable_state),
"phylink_replay_link_end() called without a prior phylink_replay_link_begin()\n"))
--
2.34.1
Powered by blists - more mailing lists