Message-ID: <20250403214437.fayvje56af3rbfrl@skbuf>
Date: Fri, 4 Apr 2025 00:44:37 +0300
From: Vladimir Oltean <olteanv@...il.com>
To: David Oberhollenzer <david.oberhollenzer@...ma-star.at>
Cc: netdev@...r.kernel.org, andrew@...n.ch, Julian.FRIEDRICH@...quentis.com,
f.fainelli@...il.com, davem@...emloft.net, edumazet@...gle.com,
kuba@...nel.org, pabeni@...hat.com, linux-kernel@...r.kernel.org,
upstream+netdev@...ma-star.at
Subject: Re: [PATCH v4] net: dsa: mv88e6xxx: propperly shutdown PPU re-enable
timer on destroy

Hi David,

On Tue, Apr 01, 2025 at 03:56:37PM +0200, David Oberhollenzer wrote:
> The mv88e6xxx has an internal PPU that polls PHY state. If we want to
> access the internal PHYs, we need to disable the PPU first. Because
> that is a slow operation, a 10ms timer is used to re-enable it,
> canceled with every access, so bulk operations effectively only
> disable it once and re-enable it some 10ms after the last access.
>
> If a PHY is accessed and then the mv88e6xxx module is removed before
> the 10ms are up, the PPU re-enable ends up accessing a dangling pointer.
>
> This especially affects probing during bootup. The MDIO bus and PHY
> registration may succeed, but registration with the DSA framework
> may fail later on (e.g. because the CPU port depends on another,
> very slow device that isn't done probing yet, returning -EPROBE_DEFER).
> In this case, probe() fails, but the MDIO subsystem may already have
> accessed the MDIO bus or PHYs, arming the timer.
>
> This is fixed as follows:
> - If probe fails after mv88e6xxx_phy_init(), make sure we also call
> mv88e6xxx_phy_destroy() before returning
> - In mv88e6xxx_remove(), make sure we do the teardown in the correct
> order, calling mv88e6xxx_phy_destroy() after unregistering the
> switch device.
> - In mv88e6xxx_phy_destroy(), destroy both the timer and the work item
> that the timer might schedule, synchronously waiting in case one of
> the callbacks already fired and destroying the timer first, before
> waiting for the work item.
> - Access to the PPU is guarded by a mutex. The worker acquires it
> with mutex_trylock() and does not proceed with the expensive
> shutdown if that fails. We grab the mutex in mv88e6xxx_phy_destroy()
> so that, by the time we wait for the work item, the slow PPU
> shutdown has either already finished or will not be entered at all.
>
> Fixes: 2e5f032095ff ("dsa: add support for the Marvell 88E6131 switch chip")
> Signed-off-by: David Oberhollenzer <david.oberhollenzer@...ma-star.at>
> ---
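
For readers following along, the timer behaviour described above can be
sketched as a small single-threaded userspace model. This is only an
illustration under stated assumptions, not the driver's code: the
struct ppu_model type, the ppu_model_*() helpers and the explicit "now"
timestamps are all hypothetical, and the model folds the timer and the
work item into one step and leaves out the mutex entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the PPU re-enable timer; not the driver's code. */
#define PPU_REENABLE_DELAY_MS 10

struct ppu_model {
	bool ppu_enabled;   /* PPU is currently polling the PHYs */
	bool timer_armed;   /* a re-enable is pending */
	long timer_expires; /* model time (ms) at which the timer fires */
	int disable_count;  /* number of slow "disable PPU" operations */
};

static void ppu_model_init(struct ppu_model *p)
{
	assert(p != NULL);
	p->ppu_enabled = true;
	p->timer_armed = false;
	p->timer_expires = 0;
	p->disable_count = 0;
}

/* A PHY access at model time 'now' (ms). */
static void ppu_model_phy_access(struct ppu_model *p, long now)
{
	assert(p != NULL);
	/* Every access cancels a pending re-enable ... */
	p->timer_armed = false;
	/* ... and pays for the slow disable only if the PPU is still on. */
	if (p->ppu_enabled) {
		p->ppu_enabled = false;
		p->disable_count++;
	}
	/* The actual PHY register access would happen here. */
	p->timer_armed = true;
	p->timer_expires = now + PPU_REENABLE_DELAY_MS;
}

/* Advance model time; fire the timer if it is due. */
static void ppu_model_tick(struct ppu_model *p, long now)
{
	assert(p != NULL);
	if (p->timer_armed && now >= p->timer_expires) {
		p->timer_armed = false;
		p->ppu_enabled = true;
	}
}

/*
 * Teardown: cancel a pending re-enable so that nothing fires afterwards.
 * Returns true if a re-enable was still pending at that point.
 */
static bool ppu_model_destroy(struct ppu_model *p)
{
	bool was_armed;

	assert(p != NULL);
	was_armed = p->timer_armed;
	p->timer_armed = false;
	return was_armed;
}
```

In this model, three accesses at t=0, 3 and 6 ms cost a single disable,
and the re-enable happens at t=16 ms, i.e. 10 ms after the last access.
An access followed by ppu_model_destroy() before the delay is up shows
the window the patch closes: a re-enable was still pending, and only
the explicit cancel in the destroy path keeps it from running later.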
Reviewed-by: Vladimir Oltean <olteanv@...il.com>