lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aGPba6fX1bqgVfYC@wunner.de>
Date: Tue, 1 Jul 2025 14:58:19 +0200
From: Lukas Wunner <lukas@...ner.de>
To: Oleksij Rempel <o.rempel@...gutronix.de>
Cc: Andrew Lunn <andrew@...n.ch>, Heiner Kallweit <hkallweit1@...il.com>,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	kernel@...gutronix.de, linux-kernel@...r.kernel.org,
	Russell King <linux@...linux.org.uk>, netdev@...r.kernel.org,
	Andre Edich <andre.edich@...rochip.com>
Subject: Re: [PATCH net v1 4/4] net: phy: smsc: Disable IRQ support to
 prevent link state corruption

On Tue, Jul 01, 2025 at 02:21:46PM +0200, Oleksij Rempel wrote:
> Disable interrupt handling for the LAN87xx PHY to prevent the network
> interface from entering a corrupted state after rapid configuration
> changes.
> 
> When the link configuration is changed quickly, the PHY can get stuck in
> a non-functional state. In this state, 'ethtool' reports that a link is
> present, but 'ip link' shows NO-CARRIER, and the interface is unable to
> transfer data.
[...]
> --- a/drivers/net/phy/smsc.c
> +++ b/drivers/net/phy/smsc.c
> @@ -746,10 +746,6 @@ static struct phy_driver smsc_phy_driver[] = {
>  	.soft_reset	= smsc_phy_reset,
>  	.config_aneg	= lan87xx_config_aneg,
>  
> -	/* IRQ related */
> -	.config_intr	= smsc_phy_config_intr,
> -	.handle_interrupt = smsc_phy_handle_interrupt,
> -

Well, that's not good.  I guess this means that the interrupt is
polled again, so we basically go back to the suboptimal behavior
prior to 1ce8b37241ed?

Without support for interrupt handling, we can't take advantage
of the GPIOs on the chip for interrupt generation.  Nor can we
properly support runtime PM if no cable is attached.

What's the actual root cause?  Is it the issue described in this
paragraph of 1ce8b37241ed's commit message?

    Normally the PHY interrupt should be masked until the PHY driver has
    cleared it.  However masking requires a (sleeping) USB transaction and
    interrupts are received in (non-sleepable) softirq context.  I decided
    not to mask the interrupt at all (by using the dummy_irq_chip's noop
    ->irq_mask() callback):  The USB interrupt endpoint is polled in 1 msec
    intervals and normally that's sufficient to wake the PHY driver's IRQ
    thread and have it clear the interrupt.  If it does take longer, worst
    thing that can happen is the IRQ thread is woken again.  No big deal.

There must be better options than going back to polling.
E.g. inserting delays to avoid the PHY getting wedged.

TBH I did test this thoroughly back in the day and never
witnessed the issue.

Thanks,

Lukas

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ