Message-ID: <DB8PR04MB67959C4B1D1AFEC5AEEB73F3E6CD9@DB8PR04MB6795.eurprd04.prod.outlook.com>
Date:   Wed, 1 Sep 2021 10:21:59 +0000
From:   Joakim Zhang <qiangqing.zhang@....com>
To:     Russell King <linux@...linux.org.uk>
CC:     "peppe.cavallaro@...com" <peppe.cavallaro@...com>,
        "alexandre.torgue@...s.st.com" <alexandre.torgue@...s.st.com>,
        "joabreu@...opsys.com" <joabreu@...opsys.com>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "kuba@...nel.org" <kuba@...nel.org>,
        "mcoquelin.stm32@...il.com" <mcoquelin.stm32@...il.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "andrew@...n.ch" <andrew@...n.ch>,
        "f.fainelli@...il.com" <f.fainelli@...il.com>,
        "hkallweit1@...il.com" <hkallweit1@...il.com>,
        dl-linux-imx <linux-imx@....com>
Subject: RE: [PATCH] net: stmmac: fix MAC not working when system resume back
 with WoL enabled


Hi Russell,

> -----Original Message-----
> From: Russell King <linux@...linux.org.uk>
> Sent: September 1, 2021 17:14
> To: Joakim Zhang <qiangqing.zhang@....com>
> Cc: peppe.cavallaro@...com; alexandre.torgue@...s.st.com;
> joabreu@...opsys.com; davem@...emloft.net; kuba@...nel.org;
> mcoquelin.stm32@...il.com; netdev@...r.kernel.org; andrew@...n.ch;
> f.fainelli@...il.com; hkallweit1@...il.com; dl-linux-imx <linux-imx@....com>
> Subject: Re: [PATCH] net: stmmac: fix MAC not working when system resume
> back with WoL enabled
> 
> On Wed, Sep 01, 2021 at 05:02:28PM +0800, Joakim Zhang wrote:
> > We can reproduce this issue with the steps below:
> > 1) enable WoL on the host
> > 2) suspend the host system
> > 3) have a remote client send wakeup packets
> > The host system resumes, but the network does not work; for example,
> > ping fails.
> >
> > After a bit of digging, this issue was introduced by commit
> > 46f69ded988d
> > ("net: stmmac: Use resolved link config in mac_link_up()"), which uses
> > the finalised link parameters in mac_link_up() rather than the
> > parameters in mac_config().
> >
> > There are two scenarios for MAC suspend/resume:
> >
> > 1) MAC suspend with WoL disabled: stmmac_suspend() calls
> > phylink_mac_change() to notify the phylink state machine of a change
> > in MAC state, so the .mac_link_down callback is invoked. It then
> > calls phylink_stop() to stop the phylink instance. When the MAC
> > resumes, phylink_start() is first called to start the phylink
> > instance, then phylink_mac_change() is called, which finally triggers
> > the phylink state machine to invoke the .mac_config and .mac_link_up
> > callbacks. All is fine, since the configuration in these two
> > callbacks is initialized again.
> >
> > 2) MAC suspend with WoL enabled: phylink_mac_change() puts the link
> > down, but there is no phylink_stop() to stop the phylink instance, so
> > the link comes up again, which means .mac_config and .mac_link_up are
> > invoked before the system suspends. After the system resumes, it
> > does DMA initialization and a SW reset, which makes the MAC lose its
> > hardware settings (i.e. the MAC_Configuration register (offset 0x0) is
> > reset). Since the link was up before the system suspended,
> > .mac_link_up is not invoked after resume, so there is no chance to
> > initialize the configuration in the .mac_link_up callback. As a
> > result, the MAC can't work any longer.
> >
> > The above description is what I found when debugging this issue.
> > This patch just reverts the broken patch to work around it, so that
> > at least the MAC works when the system resumes with WoL enabled.
> >
> > I call this a workaround since it does not resolve the issue
> > completely. I just move speed/duplex/pause etc. into the .mac_config
> > callback; there are other configurations in the .mac_link_up callback
> > which also need to be initialized for specific functions to work.
> 
> NAK. Please read the phylink documentation. speed/duplex/pause is undefined
> in .mac_config.

Speed/duplex/pause are also fields of "struct phylink_link_state", so they can be referenced in .mac_config. Please
see what stmmac did before:
https://elixir.bootlin.com/linux/v5.4.143/source/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c#L852


> I think the problem here is that you're not calling phylink_stop() when WoL is
> enabled, which means phylink will continue to maintain the state as per the
> hardware state, and phylib will continue to run its state machine reporting the
> link state to phylink.

Yes, I also tried the code change below, but then the host could not be woken up. phylink_stop() calls
phy_stop(), and phylib eventually calls phy_suspend(), which does not suspend the PHY if it detects that WoL is
enabled, so for now I don't know why the system can't be woken up with this code change.

@@ -5374,7 +5374,6 @@ int stmmac_suspend(struct device *dev)
                rtnl_lock();
                if (device_may_wakeup(priv->device))
                        phylink_speed_down(priv->phylink, false);
-               phylink_stop(priv->phylink);
                rtnl_unlock();
                mutex_lock(&priv->lock);

@@ -5385,6 +5384,10 @@ int stmmac_suspend(struct device *dev)
        }
        mutex_unlock(&priv->lock);

+       rtnl_lock();
+       phylink_stop(priv->phylink);
+       rtnl_unlock();
+
        priv->speed = SPEED_UNKNOWN;
        return 0;
 }
@@ -5448,6 +5451,12 @@ int stmmac_resume(struct device *dev)
                pinctrl_pm_select_default_state(priv->device);
                if (priv->plat->clk_ptp_ref)
                        clk_prepare_enable(priv->plat->clk_ptp_ref);
+
+               rtnl_lock();
+               /* We may have called phylink_speed_down before */
+               phylink_speed_up(priv->phylink);
+               rtnl_unlock();
+
                /* reset the phy so that it's ready */
                if (priv->mii && priv->mdio_rst_after_resume)
                        stmmac_mdio_reset(priv->mii);
@@ -5461,13 +5470,9 @@ int stmmac_resume(struct device *dev)
                        return ret;
        }

-       if (!device_may_wakeup(priv->device) || !priv->plat->pmt) {
-               rtnl_lock();
-               phylink_start(priv->phylink);
-               /* We may have called phylink_speed_down before */
-               phylink_speed_up(priv->phylink);
-               rtnl_unlock();
-       }
+       rtnl_lock();
+       phylink_start(priv->phylink);
+       rtnl_unlock();

        rtnl_lock();
        mutex_lock(&priv->lock);


> phylink_stop() (and therefore phy_stop()) should be called even if WoL is active
> to shut down this state reporting, as other network drivers do.

OK, you mean that phylink_stop() should also be called even if WoL is active. I will look in this direction, since
you are the expert.
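
To illustrate why stopping and restarting the link state machine around suspend/resume matters, here is a minimal
standalone sketch in C. It is a toy model, not kernel code: struct toy_mac and all toy_* functions are hypothetical
names, and the model only assumes what is described in this thread (the link-up callback fires only on a
down-to-up transition, and the SW reset on resume clears the MAC configuration). It says nothing about why the
wakeup itself failed with the change above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are kernel APIs. */
struct toy_mac {
	bool sm_running;    /* link state machine started? */
	bool link_up;       /* link state as tracked by the state machine */
	bool hw_configured; /* stands in for speed/duplex bits in MAC_Configuration */
};

/* PHY reports link up: the MAC is programmed only on a down->up edge. */
static void toy_link_up(struct toy_mac *m)
{
	if (!m->sm_running || m->link_up)
		return;               /* no transition, so no callback */
	m->link_up = true;
	m->hw_configured = true;      /* what a .mac_link_up-style callback would do */
}

/* Stopping the state machine forces the tracked link state down. */
static void toy_stop(struct toy_mac *m)
{
	m->sm_running = false;
	m->link_up = false;
}

static void toy_start(struct toy_mac *m)
{
	m->sm_running = true;
}

/* DMA init / SW reset on resume: hardware settings are lost. */
static void toy_sw_reset(struct toy_mac *m)
{
	m->hw_configured = false;
}

/* Returns true if the MAC is configured after one suspend/resume cycle. */
bool toy_suspend_resume(bool stop_on_suspend)
{
	struct toy_mac m = { .sm_running = true };

	toy_link_up(&m);              /* link comes up before suspend */
	assert(m.hw_configured);

	if (stop_on_suspend)
		toy_stop(&m);         /* suspend path */

	toy_sw_reset(&m);             /* resume path */
	if (stop_on_suspend)
		toy_start(&m);

	toy_link_up(&m);              /* PHY still reports link up */
	return m.hw_configured;
}
```

In this model, toy_suspend_resume(false) corresponds to the WoL-enabled case today: the tracked link never went
down, so the configuration is not reprogrammed after the reset. toy_suspend_resume(true) corresponds to always
stopping and restarting the state machine, which produces a fresh link-up transition after resume.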

Thanks.

Best Regards,
Joakim Zhang
