lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 17 Aug 2022 11:18:00 +0200
From:   Geert Uytterhoeven <geert@...ux-m68k.org>
To:     Florian Fainelli <f.fainelli@...il.com>
Cc:     Marek Szyprowski <m.szyprowski@...sung.com>,
        netdev <netdev@...r.kernel.org>,
        Steve Glendinning <steve.glendinning@...well.net>,
        Doug Berger <opendmb@...il.com>, Andrew Lunn <andrew@...n.ch>,
        Heiner Kallweit <hkallweit1@...il.com>,
        Russell King <linux@...linux.org.uk>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        open list <linux-kernel@...r.kernel.org>,
        Linux-Renesas <linux-renesas-soc@...r.kernel.org>,
        Sergey Shtylyov <s.shtylyov@....ru>
Subject: Re: [PATCH net] net: phy: Warn about incorrect mdio_bus_phy_resume() state

Hi Florian,

On Wed, Aug 17, 2022 at 4:28 AM Florian Fainelli <f.fainelli@...il.com> wrote:
> On 8/16/2022 6:20 AM, Geert Uytterhoeven wrote:
> > On Fri, Aug 12, 2022 at 6:39 PM Florian Fainelli <f.fainelli@...il.com> wrote:
> >> On 8/12/22 04:19, Marek Szyprowski wrote:
> >>> On 02.08.2022 01:34, Florian Fainelli wrote:
> >>>> Calling mdio_bus_phy_resume() with neither the PHY state machine set to
> >>>> PHY_HALTED nor phydev->mac_managed_pm set to true is a good indication
> >>>> that we can produce a race condition looking like this:
> >>>>
> >>>> CPU0                                         CPU1
> >>>> bcmgenet_resume
> >>>>     -> phy_resume
> >>>>       -> phy_init_hw
> >>>>     -> phy_start
> >>>>       -> phy_resume
> >>>>                                                    phy_start_aneg()
> >>>> mdio_bus_phy_resume
> >>>>     -> phy_resume
> >>>>        -> phy_write(..., BMCR_RESET)
> >>>>         -> usleep()                                  -> phy_read()
> >>>>
> >>>> with the phy_resume() function triggering a PHY behavior that might have
> >>>> to be worked around with (see bf8bfc4336f7 ("net: phy: broadcom: Fix
> >>>> brcm_fet_config_init()") for instance) that ultimately leads to an error
> >>>> reading from the PHY.
> >>>>
> >>>> Fixes: fba863b81604 ("net: phy: make PHY PM ops a no-op if MAC driver manages PHY PM")
> >>>> Signed-off-by: Florian Fainelli <f.fainelli@...il.com>
> >>>
> >>> This patch, as probably intended, triggers a warning during system
> >>> suspend/resume cycle in the SMSC911x driver. I've observed it on ARM
> >>> Juno R1 board on the kernel compiled from next-202208010:
> >>>
> >>>     ------------[ cut here ]------------
> >>>     WARNING: CPU: 1 PID: 398 at drivers/net/phy/phy_device.c:323
> >>> mdio_bus_phy_resume+0x34/0xc8
> >
> > I am seeing the same on the ape6evm and kzm9g development
> > boards with smsc911x Ethernet, and on various boards with Renesas
> > Ethernet (sh_eth or ravb) if Wake-on-LAN is disabled.
> >
> >> Yes this is catching an actual issue in the driver in that the PHY state
> >> machine is still running while the system is trying to suspend. We could
> >> go about fixing it in a different number of ways, though I believe this
> >> one is probably correct enough to work and fix the warning:
> >
> >> --- a/drivers/net/ethernet/smsc/smsc911x.c
> >> +++ b/drivers/net/ethernet/smsc/smsc911x.c
> >> @@ -1037,6 +1037,8 @@ static int smsc911x_mii_probe(struct net_device *dev)
> >>                   return ret;
> >>           }
> >>
> >> +       /* Indicate that the MAC is responsible for managing PHY PM */
> >> +       phydev->mac_managed_pm = true;
> >>           phy_attached_info(phydev);
> >>
> >>           phy_set_max_speed(phydev, SPEED_100);
> >> @@ -2587,6 +2589,8 @@ static int smsc911x_suspend(struct device *dev)
> >>           if (netif_running(ndev)) {
> >>                   netif_stop_queue(ndev);
> >>                   netif_device_detach(ndev);
> >> +               if (!device_may_wakeup(dev))
> >> +                       phy_suspend(dev->phydev);
> >>           }
> >>
> >>           /* enable wake on LAN, energy detection and the external PME
> >> @@ -2628,6 +2632,8 @@ static int smsc911x_resume(struct device *dev)
> >>           if (netif_running(ndev)) {
> >>                   netif_device_attach(ndev);
> >>                   netif_start_queue(ndev);
> >> +               if (!device_may_wakeup(dev))
> >> +                       phy_resume(dev->phydev);
> >>           }
> >>
> >>           return 0;
> >
> > Thanks for your patch, but unfortunately this does not work on ape6evm
> > and kzm9g, where the smsc911x device is connected to a power-managed
> > bus.  It looks like the PHY registers are accessed while the device
> > is already suspended, causing a crash during system suspend:
>
> Does it work better if you replace phy_suspend() with phy_stop() and
> phy_resume() with phy_start()?

Thank you, much better!
Tested-by: Geert Uytterhoeven <geert+renesas@...der.be>

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@...ux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ