lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220519190841.GA30869@wunner.de>
Date:   Thu, 19 May 2022 21:08:41 +0200
From:   Lukas Wunner <lukas@...ner.de>
To:     Marek Szyprowski <m.szyprowski@...sung.com>
Cc:     "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        Eric Dumazet <edumazet@...gle.com>, netdev@...r.kernel.org,
        linux-usb@...r.kernel.org,
        Steve Glendinning <steve.glendinning@...well.net>,
        UNGLinuxDriver@...rochip.com, Oliver Neukum <oneukum@...e.com>,
        Andre Edich <andre.edich@...rochip.com>,
        Oleksij Rempel <linux@...pel-privat.de>,
        Martyn Welch <martyn.welch@...labora.com>,
        Gabriel Hojda <ghojda@...urs.ro>,
        Christoph Fritz <chf.fritz@...glemail.com>,
        Lino Sanfilippo <LinoSanfilippo@....de>,
        Philipp Rosenberger <p.rosenberger@...bus.com>,
        Heiner Kallweit <hkallweit1@...il.com>,
        Andrew Lunn <andrew@...n.ch>,
        Russell King <linux@...linux.org.uk>,
        Ferry Toth <fntoth@...il.com>,
        Krzysztof Kozlowski <krzk@...nel.org>,
        'Linux Samsung SOC' <linux-samsung-soc@...r.kernel.org>
Subject: Re: [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts
 to PHY driver to avoid polling

On Tue, May 17, 2022 at 12:18:45PM +0200, Marek Szyprowski wrote:
> This patch landed in the recent linux next-20220516 as commit 
> 1ce8b37241ed ("usbnet: smsc95xx: Forward PHY interrupts to PHY driver to 
> avoid polling"). Unfortunately it breaks smsc95xx usb ethernet operation 
> after system suspend-resume cycle. On the Odroid XU3 board I got the 
> following warning in the kernel log:
> 
> # time rtcwake -s10 -mmem
> rtcwake: wakeup from "mem" using /dev/rtc0 at Tue May 17 09:16:07 2022
> PM: suspend entry (deep)
> Filesystems sync: 0.001 seconds
> Freezing user space processes ... (elapsed 0.002 seconds) done.
> OOM killer disabled.
> Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
> printk: Suspending console(s) (use no_console_suspend to debug)
> smsc95xx 4-1.1:1.0 eth0: entering SUSPEND2 mode
> smsc95xx 4-1.1:1.0 eth0: Failed to read reg index 0x00000114: -113
> smsc95xx 4-1.1:1.0 eth0: Error reading MII_ACCESS
> smsc95xx 4-1.1:1.0 eth0: __smsc95xx_mdio_read: MII is busy
> ------------[ cut here ]------------
> WARNING: CPU: 2 PID: 73 at drivers/net/phy/phy.c:946
> phy_state_machine+0x98/0x28c
[...]
> It looks that the driver's suspend/resume operations might need some 
> adjustments. After the system suspend/resume cycle the driver is not 
> operational anymore. Reverting the $subject patch on top of linux 
> next-20220516 restores ethernet operation after system suspend/resume.

Thanks a lot for the report.  It seems the PHY is signaling a link change
shortly before system sleep and by the time the phy_state_machine() worker
gets around to handle it, the device has already been suspended and thus
refuses any further USB requests with -EHOSTUNREACH (-113):

usb_suspend_both()
  usb_suspend_interface()
    smsc95xx_suspend()
      usbnet_suspend()
        __usbnet_status_stop_force() # stops interrupt polling,
                                     # link change is signaled before this

  udev->can_submit = 0               # refuse further URBs

Assuming the above theory is correct, calling phy_stop_machine()
after usbnet_suspend() would be sufficient to fix the issue.
It cancels the phy_state_machine() worker.

The small patch below does that.  Could you give it a spin?

Taking a step back though, I'm wondering if there's a bigger problem here:
This is a USB device, so we stop receiving interrupts once the Interrupt
Endpoint is no longer polled.  But what if a PHY's interrupt is attached
to a GPIO of the SoC and that interrupt is raised while the system is
suspending?  The interrupt handler may likewise try to reach an
inaccessible (suspended) device.

The right thing to do would probably be to signal wakeup.  But the
PHY drivers' irq handlers instead schedule the phy_state_machine().
Perhaps we need something like the following at the top of
phy_state_machine():

	if (phydev->suspended) {
		pm_wakeup_dev_event(&phydev->mdio.dev, 0, true);
		return;
	}

However, phydev->suspended is set at the *bottom* of phy_suspend(),
it would have to be set at the *top* of mdio_bus_phy_suspend()
for the above to be correct.  Hmmm...

Thanks,

Lukas

-- >8 --
diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
index bd03e16..d351a6c 100644
--- a/drivers/net/usb/smsc95xx.c
+++ b/drivers/net/usb/smsc95xx.c
@@ -1201,6 +1201,7 @@ static int smsc95xx_bind(struct usbnet *dev, struct usb_interface *intf)
 	}
 
 	pdata->phydev->irq = phy_irq;
+	pdata->phydev->mac_managed_pm = true;
 	pdata->phydev->is_internal = is_internal_phy;
 
 	/* detect device revision as different features may be available */
@@ -1496,6 +1497,9 @@ static int smsc95xx_suspend(struct usb_interface *intf, pm_message_t message)
 		return ret;
 	}
 
+	if (netif_running(dev->net))
+		phy_stop(pdata->phydev);
+
 	if (pdata->suspend_flags) {
 		netdev_warn(dev->net, "error during last resume\n");
 		pdata->suspend_flags = 0;
@@ -1778,6 +1782,8 @@ static int smsc95xx_resume(struct usb_interface *intf)
 	}
 
 	phy_init_hw(pdata->phydev);
+	if (netif_running(dev->net))
+		phy_start(pdata->phydev);
 
 	ret = usbnet_resume(intf);
 	if (ret < 0)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ