Message-ID: <aGPba6fX1bqgVfYC@wunner.de>
Date: Tue, 1 Jul 2025 14:58:19 +0200
From: Lukas Wunner <lukas@...ner.de>
To: Oleksij Rempel <o.rempel@...gutronix.de>
Cc: Andrew Lunn <andrew@...n.ch>, Heiner Kallweit <hkallweit1@...il.com>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
kernel@...gutronix.de, linux-kernel@...r.kernel.org,
Russell King <linux@...linux.org.uk>, netdev@...r.kernel.org,
Andre Edich <andre.edich@...rochip.com>
Subject: Re: [PATCH net v1 4/4] net: phy: smsc: Disable IRQ support to
prevent link state corruption
On Tue, Jul 01, 2025 at 02:21:46PM +0200, Oleksij Rempel wrote:
> Disable interrupt handling for the LAN87xx PHY to prevent the network
> interface from entering a corrupted state after rapid configuration
> changes.
>
> When the link configuration is changed quickly, the PHY can get stuck in
> a non-functional state. In this state, 'ethtool' reports that a link is
> present, but 'ip link' shows NO-CARRIER, and the interface is unable to
> transfer data.
[...]
> --- a/drivers/net/phy/smsc.c
> +++ b/drivers/net/phy/smsc.c
> @@ -746,10 +746,6 @@ static struct phy_driver smsc_phy_driver[] = {
> .soft_reset = smsc_phy_reset,
> .config_aneg = lan87xx_config_aneg,
>
> - /* IRQ related */
> - .config_intr = smsc_phy_config_intr,
> - .handle_interrupt = smsc_phy_handle_interrupt,
> -
Well, that's not good.  I guess this means that the interrupt is
polled again, so we basically go back to the suboptimal behavior
prior to 1ce8b37241ed?

Without support for interrupt handling, we can't take advantage
of the GPIOs on the chip for interrupt generation.  Nor can we
properly support runtime PM if no cable is attached.

What's the actual root cause?  Is it the issue described in this
paragraph of 1ce8b37241ed's commit message?

    Normally the PHY interrupt should be masked until the PHY driver has
    cleared it.  However masking requires a (sleeping) USB transaction and
    interrupts are received in (non-sleepable) softirq context.  I decided
    not to mask the interrupt at all (by using the dummy_irq_chip's noop
    ->irq_mask() callback):  The USB interrupt endpoint is polled in 1 msec
    intervals and normally that's sufficient to wake the PHY driver's IRQ
    thread and have it clear the interrupt.  If it does take longer, worst
    thing that can happen is the IRQ thread is woken again.  No big deal.

There must be better options than going back to polling.
E.g. inserting delays to avoid the PHY getting wedged.

TBH I did test this thoroughly back in the day and never
witnessed the issue.

Thanks,

Lukas