[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <YwaqZ1+zm78vl4L1@sirena.org.uk>
Date: Wed, 24 Aug 2022 23:47:03 +0100
From: Mark Brown <broonie@...nel.org>
To: Oleksij Rempel <o.rempel@...gutronix.de>,
Ferry Toth <fntoth@...il.com>, Lukas Wunner <lukas@...ner.de>,
Andrew Lunn <andrew@...n.ch>,
Andre Edich <andre.edich@...rochip.com>,
"David S. Miller" <davem@...emloft.net>,
Sasha Levin <sashal@...nel.org>
Cc: kernelci-results@...ups.io, bot@...nelci.org,
gtucker@...labora.com, stable@...r.kernel.org,
netdev@...r.kernel.org, linux-omap@...r.kernel.org
Subject: Re: stable-rc/linux-5.18.y bisection: baseline.login on panda
On Wed, Aug 24, 2022 at 01:22:12PM -0700, KernelCI bot wrote:
The KernelCI bisection bot identified 7eea9a60703ca ("usbnet: smsc95xx:
Forward PHY interrupts to PHY driver to avoid polling") as causing a
boot regression on Panda in v5.18. The board stops detecting a link
which breks NFS boot:
<6>[ 4.953613] smsc95xx 2-1.1:1.0 eth0: register 'smsc95xx' at usb-4a064c00.ehci-1.1, smsc95xx USB 2.0 Ethernet, 02:03:01:8c:13:b0
<6>[ 5.032928] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
<6>[ 5.044036] smsc95xx 2-1.1:1.0 eth0: Link is Down
<6>[ 25.053924] Waiting up to 100 more seconds for network.
<6>[ 45.074005] Waiting up to 80 more seconds for network.
<6>[ 65.093933] Waiting up to 60 more seconds for network.
<6>[ 85.123992] Waiting up to 40 more seconds for network.
<6>[ 105.143951] Waiting up to 20 more seconds for network.
<5>[ 125.084014] Sending DHCP requests ...... timed out!
<6>[ 207.765808] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
<3>[ 207.773468] IP-Config: Reopening network devices...
<6>[ 207.840332] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
<6>[ 207.851226] smsc95xx 2-1.1:1.0 eth0: Link is Down
<6>[ 227.873931] Waiting up to 100 more seconds for network.
<6>[ 247.873931] Waiting up to 80 more seconds for network.
We're seen other failures on this board in mainline which stop boot
sooner so can't confirm if there's a similar issue there.
I've left the whole report, including links to more details like full
logs and a tag for the bot below:
> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
> * This automated bisection report was sent to you on the basis *
> * that you may be involved with the breaking commit it has *
> * found. No manual investigation has been done to verify it, *
> * and the root cause of the problem may be somewhere else. *
> * *
> * If you do send a fix, please include this trailer: *
> * Reported-by: "kernelci.org bot" <bot@...nelci.org> *
> * *
> * Hope this helps! *
> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
>
> stable-rc/linux-5.18.y bisection: baseline.login on panda
>
> Summary:
> Start: 22a992953741a Linux 5.18.19
> Plain log: https://storage.kernelci.org/stable-rc/linux-5.18.y/v5.18.19/arm/multi_v7_defconfig/gcc-10/lab-baylibre/baseline-panda.txt
> HTML log: https://storage.kernelci.org/stable-rc/linux-5.18.y/v5.18.19/arm/multi_v7_defconfig/gcc-10/lab-baylibre/baseline-panda.html
> Result: 7eea9a60703ca usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
>
> Checks:
> revert: PASS
> verify: PASS
>
> Parameters:
> Tree: stable-rc
> URL: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
> Branch: linux-5.18.y
> Target: panda
> CPU arch: arm
> Lab: lab-baylibre
> Compiler: gcc-10
> Config: multi_v7_defconfig
> Test case: baseline.login
>
> Breaking commit found:
>
> -------------------------------------------------------------------------------
> commit 7eea9a60703caf1b04eccba93cd103f1c8fc6809
> Author: Lukas Wunner <lukas@...ner.de>
> Date: Thu May 12 10:42:05 2022 +0200
>
> usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
>
> [ Upstream commit 1ce8b37241ed291af56f7a49bbdbf20c08728e88 ]
>
> Link status of SMSC LAN95xx chips is polled once per second, even though
> they're capable of signaling PHY interrupts through the MAC layer.
>
> Forward those interrupts to the PHY driver to avoid polling. Benefits
> are reduced bus traffic, reduced CPU overhead and quicker interface
> bringup.
>
> Polling was introduced in 2016 by commit d69d16949346 ("usbnet:
> smsc95xx: fix link detection for disabled autonegotiation").
> Back then, the LAN95xx driver neglected to enable the ENERGYON interrupt,
> hence couldn't detect link-up events when auto-negotiation was disabled.
> The proper solution would have been to enable the ENERGYON interrupt
> instead of polling.
>
> Since then, PHY handling was moved from the LAN95xx driver to the SMSC
> PHY driver with commit 05b35e7eb9a1 ("smsc95xx: add phylib support").
> That PHY driver is capable of link detection with auto-negotiation
> disabled because it enables the ENERGYON interrupt.
>
> Note that signaling interrupts through the MAC layer not only works with
> the integrated PHY, but also with an external PHY, provided its
> interrupt pin is attached to LAN95xx's nPHY_INT pin.
>
> In the unlikely event that the interrupt pin of an external PHY is
> attached to a GPIO of the SoC (or not connected at all), the driver can
> be amended to retrieve the irq from the PHY's of_node.
>
> To forward PHY interrupts to phylib, it is not sufficient to call
> phy_mac_interrupt(). Instead, the PHY's interrupt handler needs to run
> so that PHY interrupts are cleared. That's because according to page
> 119 of the LAN950x datasheet, "The source of this interrupt is a level.
> The interrupt persists until it is cleared in the PHY."
>
> https://www.microchip.com/content/dam/mchp/documents/UNG/ProductDocuments/DataSheets/LAN950x-Data-Sheet-DS00001875D.pdf
>
> Therefore, create an IRQ domain with a single IRQ for the PHY. In the
> future, the IRQ domain may be extended to support the 11 GPIOs on the
> LAN95xx.
>
> Normally the PHY interrupt should be masked until the PHY driver has
> cleared it. However masking requires a (sleeping) USB transaction and
> interrupts are received in (non-sleepable) softirq context. I decided
> not to mask the interrupt at all (by using the dummy_irq_chip's noop
> ->irq_mask() callback): The USB interrupt endpoint is polled in 1 msec
> intervals and normally that's sufficient to wake the PHY driver's IRQ
> thread and have it clear the interrupt. If it does take longer, worst
> thing that can happen is the IRQ thread is woken again. No big deal.
>
> Because PHY interrupts are now perpetually enabled, there's no need to
> selectively enable them on suspend. So remove all invocations of
> smsc95xx_enable_phy_wakeup_interrupts().
>
> In smsc95xx_resume(), move the call of phy_init_hw() before
> usbnet_resume() (which restarts the status URB) to ensure that the PHY
> is fully initialized when an interrupt is handled.
>
> Tested-by: Oleksij Rempel <o.rempel@...gutronix.de> # LAN9514/9512/9500
> Tested-by: Ferry Toth <fntoth@...il.com> # LAN9514
> Signed-off-by: Lukas Wunner <lukas@...ner.de>
> Reviewed-by: Andrew Lunn <andrew@...n.ch> # from a PHY perspective
> Cc: Andre Edich <andre.edich@...rochip.com>
> Signed-off-by: David S. Miller <davem@...emloft.net>
> Signed-off-by: Sasha Levin <sashal@...nel.org>
>
> diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
> index f5a208948d22a..358b170cc8fb7 100644
> --- a/drivers/net/usb/smsc95xx.c
> +++ b/drivers/net/usb/smsc95xx.c
> @@ -18,6 +18,8 @@
> #include <linux/usb/usbnet.h>
> #include <linux/slab.h>
> #include <linux/of_net.h>
> +#include <linux/irq.h>
> +#include <linux/irqdomain.h>
> #include <linux/mdio.h>
> #include <linux/phy.h>
> #include <net/selftests.h>
> @@ -53,6 +55,9 @@
> #define SUSPEND_ALLMODES (SUSPEND_SUSPEND0 | SUSPEND_SUSPEND1 | \
> SUSPEND_SUSPEND2 | SUSPEND_SUSPEND3)
>
> +#define SMSC95XX_NR_IRQS (1) /* raise to 12 for GPIOs */
> +#define PHY_HWIRQ (SMSC95XX_NR_IRQS - 1)
> +
> struct smsc95xx_priv {
> u32 mac_cr;
> u32 hash_hi;
> @@ -61,6 +66,9 @@ struct smsc95xx_priv {
> spinlock_t mac_cr_lock;
> u8 features;
> u8 suspend_flags;
> + struct irq_chip irqchip;
> + struct irq_domain *irqdomain;
> + struct fwnode_handle *irqfwnode;
> struct mii_bus *mdiobus;
> struct phy_device *phydev;
> };
> @@ -597,6 +605,8 @@ static void smsc95xx_mac_update_fullduplex(struct usbnet *dev)
>
> static void smsc95xx_status(struct usbnet *dev, struct urb *urb)
> {
> + struct smsc95xx_priv *pdata = dev->driver_priv;
> + unsigned long flags;
> u32 intdata;
>
> if (urb->actual_length != 4) {
> @@ -608,11 +618,15 @@ static void smsc95xx_status(struct usbnet *dev, struct urb *urb)
> intdata = get_unaligned_le32(urb->transfer_buffer);
> netif_dbg(dev, link, dev->net, "intdata: 0x%08X\n", intdata);
>
> + local_irq_save(flags);
> +
> if (intdata & INT_ENP_PHY_INT_)
> - ;
> + generic_handle_domain_irq(pdata->irqdomain, PHY_HWIRQ);
> else
> netdev_warn(dev->net, "unexpected interrupt, intdata=0x%08X\n",
> intdata);
> +
> + local_irq_restore(flags);
> }
>
> /* Enable or disable Tx & Rx checksum offload engines */
> @@ -1098,8 +1112,9 @@ static int smsc95xx_bind(struct usbnet *dev, struct usb_interface *intf)
> {
> struct smsc95xx_priv *pdata;
> bool is_internal_phy;
> + char usb_path[64];
> + int ret, phy_irq;
> u32 val;
> - int ret;
>
> printk(KERN_INFO SMSC_CHIPNAME " v" SMSC_DRIVER_VERSION "\n");
>
> @@ -1139,10 +1154,38 @@ static int smsc95xx_bind(struct usbnet *dev, struct usb_interface *intf)
> if (ret)
> goto free_pdata;
>
> + /* create irq domain for use by PHY driver and GPIO consumers */
> + usb_make_path(dev->udev, usb_path, sizeof(usb_path));
> + pdata->irqfwnode = irq_domain_alloc_named_fwnode(usb_path);
> + if (!pdata->irqfwnode) {
> + ret = -ENOMEM;
> + goto free_pdata;
> + }
> +
> + pdata->irqdomain = irq_domain_create_linear(pdata->irqfwnode,
> + SMSC95XX_NR_IRQS,
> + &irq_domain_simple_ops,
> + pdata);
> + if (!pdata->irqdomain) {
> + ret = -ENOMEM;
> + goto free_irqfwnode;
> + }
> +
> + phy_irq = irq_create_mapping(pdata->irqdomain, PHY_HWIRQ);
> + if (!phy_irq) {
> + ret = -ENOENT;
> + goto remove_irqdomain;
> + }
> +
> + pdata->irqchip = dummy_irq_chip;
> + pdata->irqchip.name = SMSC_CHIPNAME;
> + irq_set_chip_and_handler_name(phy_irq, &pdata->irqchip,
> + handle_simple_irq, "phy");
> +
> pdata->mdiobus = mdiobus_alloc();
> if (!pdata->mdiobus) {
> ret = -ENOMEM;
> - goto free_pdata;
> + goto dispose_irq;
> }
>
> ret = smsc95xx_read_reg(dev, HW_CFG, &val);
> @@ -1175,6 +1218,7 @@ static int smsc95xx_bind(struct usbnet *dev, struct usb_interface *intf)
> goto unregister_mdio;
> }
>
> + pdata->phydev->irq = phy_irq;
> pdata->phydev->is_internal = is_internal_phy;
>
> /* detect device revision as different features may be available */
> @@ -1217,6 +1261,15 @@ static int smsc95xx_bind(struct usbnet *dev, struct usb_interface *intf)
> free_mdio:
> mdiobus_free(pdata->mdiobus);
>
> +dispose_irq:
> + irq_dispose_mapping(phy_irq);
> +
> +remove_irqdomain:
> + irq_domain_remove(pdata->irqdomain);
> +
> +free_irqfwnode:
> + irq_domain_free_fwnode(pdata->irqfwnode);
> +
> free_pdata:
> kfree(pdata);
> return ret;
> @@ -1229,6 +1282,9 @@ static void smsc95xx_unbind(struct usbnet *dev, struct usb_interface *intf)
> phy_disconnect(dev->net->phydev);
> mdiobus_unregister(pdata->mdiobus);
> mdiobus_free(pdata->mdiobus);
> + irq_dispose_mapping(irq_find_mapping(pdata->irqdomain, PHY_HWIRQ));
> + irq_domain_remove(pdata->irqdomain);
> + irq_domain_free_fwnode(pdata->irqfwnode);
> netif_dbg(dev, ifdown, dev->net, "free pdata\n");
> kfree(pdata);
> }
> @@ -1253,29 +1309,6 @@ static u32 smsc_crc(const u8 *buffer, size_t len, int filter)
> return crc << ((filter % 2) * 16);
> }
>
> -static int smsc95xx_enable_phy_wakeup_interrupts(struct usbnet *dev, u16 mask)
> -{
> - int ret;
> -
> - netdev_dbg(dev->net, "enabling PHY wakeup interrupts\n");
> -
> - /* read to clear */
> - ret = smsc95xx_mdio_read_nopm(dev, PHY_INT_SRC);
> - if (ret < 0)
> - return ret;
> -
> - /* enable interrupt source */
> - ret = smsc95xx_mdio_read_nopm(dev, PHY_INT_MASK);
> - if (ret < 0)
> - return ret;
> -
> - ret |= mask;
> -
> - smsc95xx_mdio_write_nopm(dev, PHY_INT_MASK, ret);
> -
> - return 0;
> -}
> -
> static int smsc95xx_link_ok_nopm(struct usbnet *dev)
> {
> int ret;
> @@ -1442,7 +1475,6 @@ static int smsc95xx_enter_suspend3(struct usbnet *dev)
> static int smsc95xx_autosuspend(struct usbnet *dev, u32 link_up)
> {
> struct smsc95xx_priv *pdata = dev->driver_priv;
> - int ret;
>
> if (!netif_running(dev->net)) {
> /* interface is ifconfig down so fully power down hw */
> @@ -1461,27 +1493,10 @@ static int smsc95xx_autosuspend(struct usbnet *dev, u32 link_up)
> }
>
> netdev_dbg(dev->net, "autosuspend entering SUSPEND1\n");
> -
> - /* enable PHY wakeup events for if cable is attached */
> - ret = smsc95xx_enable_phy_wakeup_interrupts(dev,
> - PHY_INT_MASK_ANEG_COMP_);
> - if (ret < 0) {
> - netdev_warn(dev->net, "error enabling PHY wakeup ints\n");
> - return ret;
> - }
> -
> netdev_info(dev->net, "entering SUSPEND1 mode\n");
> return smsc95xx_enter_suspend1(dev);
> }
>
> - /* enable PHY wakeup events so we remote wakeup if cable is pulled */
> - ret = smsc95xx_enable_phy_wakeup_interrupts(dev,
> - PHY_INT_MASK_LINK_DOWN_);
> - if (ret < 0) {
> - netdev_warn(dev->net, "error enabling PHY wakeup ints\n");
> - return ret;
> - }
> -
> netdev_dbg(dev->net, "autosuspend entering SUSPEND3\n");
> return smsc95xx_enter_suspend3(dev);
> }
> @@ -1547,13 +1562,6 @@ static int smsc95xx_suspend(struct usb_interface *intf, pm_message_t message)
> }
>
> if (pdata->wolopts & WAKE_PHY) {
> - ret = smsc95xx_enable_phy_wakeup_interrupts(dev,
> - (PHY_INT_MASK_ANEG_COMP_ | PHY_INT_MASK_LINK_DOWN_));
> - if (ret < 0) {
> - netdev_warn(dev->net, "error enabling PHY wakeup ints\n");
> - goto done;
> - }
> -
> /* if link is down then configure EDPD and enter SUSPEND1,
> * otherwise enter SUSPEND0 below
> */
> @@ -1787,11 +1795,12 @@ static int smsc95xx_resume(struct usb_interface *intf)
> return ret;
> }
>
> + phy_init_hw(pdata->phydev);
> +
> ret = usbnet_resume(intf);
> if (ret < 0)
> netdev_warn(dev->net, "usbnet_resume error\n");
>
> - phy_init_hw(pdata->phydev);
> return ret;
> }
> -------------------------------------------------------------------------------
>
>
> Git bisection log:
>
> -------------------------------------------------------------------------------
> git bisect start
> # good: [9aa5a042881d4a99657f82c774e9e15353ebeb2d] Linux 5.18.14
> git bisect good 9aa5a042881d4a99657f82c774e9e15353ebeb2d
> # bad: [22a992953741ad79c07890d3f4104585e52ef26b] Linux 5.18.19
> git bisect bad 22a992953741ad79c07890d3f4104585e52ef26b
> # good: [77d5174ed962c259965226386f71575790646ec0] drm/mediatek: dpi: Remove output format of YUV
> git bisect good 77d5174ed962c259965226386f71575790646ec0
> # good: [598bc9541e2f82a7fe8dfe23891201b355a56cda] usb: dwc3: core: Do not perform GCTL_CORE_SOFTRESET during bootup
> git bisect good 598bc9541e2f82a7fe8dfe23891201b355a56cda
> # good: [876f57cc94922896cc71dd4696013a7c0558c9b4] f2fs: give priority to select unpinned section for foreground GC
> git bisect good 876f57cc94922896cc71dd4696013a7c0558c9b4
> # bad: [5f4e505909fe50a4e256704076594ee3def0b9b1] block: add a bdev_max_zone_append_sectors helper
> git bisect bad 5f4e505909fe50a4e256704076594ee3def0b9b1
> # good: [b9c3401f7cac6ae291a16784dadcd1bf116218fe] x86/kprobes: Update kcb status flag after singlestepping
> git bisect good b9c3401f7cac6ae291a16784dadcd1bf116218fe
> # bad: [73ce2046e04ad488cecc66757c36cbe1bdf089d4] iommu/vt-d: avoid invalid memory access via node_online(NUMA_NO_NODE)
> git bisect bad 73ce2046e04ad488cecc66757c36cbe1bdf089d4
> # good: [f624c94ad56b663693a9413d8c8c03635435f369] drm/vc4: drv: Adopt the dma configuration from the HVS or V3D component
> git bisect good f624c94ad56b663693a9413d8c8c03635435f369
> # bad: [87c4896d5dd7fd9927c814cf3c6289f41de3b562] firmware: arm_scpi: Ensure scpi_info is not assigned if the probe fails
> git bisect bad 87c4896d5dd7fd9927c814cf3c6289f41de3b562
> # good: [9b61971cab92a918cab7168d439a351b1c799aca] usbnet: smsc95xx: Avoid link settings race on interrupt reception
> git bisect good 9b61971cab92a918cab7168d439a351b1c799aca
> # bad: [e81395cfbe62518f41af79a1b287fc8fe7c96b37] usbnet: smsc95xx: Fix deadlock on runtime resume
> git bisect bad e81395cfbe62518f41af79a1b287fc8fe7c96b37
> # bad: [7eea9a60703caf1b04eccba93cd103f1c8fc6809] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
> git bisect bad 7eea9a60703caf1b04eccba93cd103f1c8fc6809
> # first bad commit: [7eea9a60703caf1b04eccba93cd103f1c8fc6809] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
> -------------------------------------------------------------------------------
>
>
> -=-=-=-=-=-=-=-=-=-=-=-
> Groups.io Links: You receive all messages sent to this group.
> View/Reply Online (#30878): https://groups.io/g/kernelci-results/message/30878
> Mute This Topic: https://groups.io/mt/93235418/1131744
> Group Owner: kernelci-results+owner@...ups.io
> Unsubscribe: https://groups.io/g/kernelci-results/unsub [broonie@...nel.org]
> -=-=-=-=-=-=-=-=-=-=-=-
>
>
Download attachment "signature.asc" of type "application/pgp-signature" (489 bytes)
Powered by blists - more mailing lists