Message-ID: <DB8PR04MB6795BB2A13AED5F6E56D08A0E6CE9@DB8PR04MB6795.eurprd04.prod.outlook.com>
Date: Thu, 2 Sep 2021 07:28:44 +0000
From: Joakim Zhang <qiangqing.zhang@....com>
To: Russell King <linux@...linux.org.uk>
CC: Vladimir Oltean <olteanv@...il.com>,
"peppe.cavallaro@...com" <peppe.cavallaro@...com>,
"alexandre.torgue@...s.st.com" <alexandre.torgue@...s.st.com>,
"joabreu@...opsys.com" <joabreu@...opsys.com>,
"davem@...emloft.net" <davem@...emloft.net>,
"kuba@...nel.org" <kuba@...nel.org>,
"mcoquelin.stm32@...il.com" <mcoquelin.stm32@...il.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"andrew@...n.ch" <andrew@...n.ch>,
"f.fainelli@...il.com" <f.fainelli@...il.com>,
"hkallweit1@...il.com" <hkallweit1@...il.com>,
dl-linux-imx <linux-imx@....com>
Subject: RE: [PATCH] net: stmmac: fix MAC not working when system resume back
with WoL enabled
Hi Russell,
> -----Original Message-----
> From: Russell King <linux@...linux.org.uk>
> Sent: September 1, 2021 21:26
> To: Joakim Zhang <qiangqing.zhang@....com>
> Cc: Vladimir Oltean <olteanv@...il.com>; peppe.cavallaro@...com;
> alexandre.torgue@...s.st.com; joabreu@...opsys.com;
> davem@...emloft.net; kuba@...nel.org; mcoquelin.stm32@...il.com;
> netdev@...r.kernel.org; andrew@...n.ch; f.fainelli@...il.com;
> hkallweit1@...il.com; dl-linux-imx <linux-imx@....com>
> Subject: Re: [PATCH] net: stmmac: fix MAC not working when system resume
> back with WoL enabled
>
> On Wed, Sep 01, 2021 at 11:42:08AM +0000, Joakim Zhang wrote:
> > Hi Vladimir,
> >
> > > -----Original Message-----
> > > From: Vladimir Oltean <olteanv@...il.com>
> > > Sent: September 1, 2021 18:56
> > > To: Joakim Zhang <qiangqing.zhang@....com>
> > > Cc: peppe.cavallaro@...com; alexandre.torgue@...s.st.com;
> > > joabreu@...opsys.com; davem@...emloft.net; kuba@...nel.org;
> > > mcoquelin.stm32@...il.com; linux@...linux.org.uk;
> > > netdev@...r.kernel.org; andrew@...n.ch; f.fainelli@...il.com;
> > > hkallweit1@...il.com; dl-linux-imx <linux-imx@....com>
> > > Subject: Re: [PATCH] net: stmmac: fix MAC not working when system
> > > resume back with WoL enabled
> > >
> > > On Wed, Sep 01, 2021 at 10:25:15AM +0000, Joakim Zhang wrote:
> > > >
> > > > Hi Vladimir,
> > > >
> > > > > -----Original Message-----
> > > > > From: Vladimir Oltean <olteanv@...il.com>
> > > > > Sent: September 1, 2021 17:22
> > > > > To: Joakim Zhang <qiangqing.zhang@....com>
> > > > > Cc: peppe.cavallaro@...com; alexandre.torgue@...s.st.com;
> > > > > joabreu@...opsys.com; davem@...emloft.net; kuba@...nel.org;
> > > > > mcoquelin.stm32@...il.com; linux@...linux.org.uk;
> > > > > netdev@...r.kernel.org; andrew@...n.ch; f.fainelli@...il.com;
> > > > > hkallweit1@...il.com; dl-linux-imx <linux-imx@....com>
> > > > > Subject: Re: [PATCH] net: stmmac: fix MAC not working when
> > > > > system resume back with WoL enabled
> > > > >
> > > > > On Wed, Sep 01, 2021 at 05:02:28PM +0800, Joakim Zhang wrote:
> > > > > > We can reproduce this issue with below steps:
> > > > > > 1) enable WoL on the host
> > > > > > 2) host system suspended
> > > > > > > 3) remote client send out wakeup packets
> > > > > > >
> > > > > > > We can see that host system resume back, but can't work, such as
> > > > > > > ping failed.
> > > > > >
> > > > > > After a bit digging, this issue is introduced by the commit
> > > > > > 46f69ded988d
> > > > > > ("net: stmmac: Use resolved link config in mac_link_up()"),
> > > > > > which use the finalised link parameters in mac_link_up()
> > > > > > rather than the parameters in mac_config().
> > > > > >
> > > > > > There are two scenarios for MAC suspend/resume:
> > > > > >
> > > > > > 1) MAC suspend with WoL disabled, stmmac_suspend() call
> > > > > > phylink_mac_change() to notify phylink machine that a change
> > > > > > in MAC state, then .mac_link_down callback would be invoked.
> > > > > > Further, it will call phylink_stop() to stop the phylink
> > > > > > instance. When MAC resume back, firstly phylink_start() is
> > > > > > called to start the phylink instance, then call
> > > > > > phylink_mac_change() which will finally trigger phylink
> > > > > > machine to invoke .mac_config and .mac_link_up callback. All
> > > > > > > is fine since configuration in these two callbacks will be
> > > > > > > initialized.
> > > > > >
> > > > > > 2) MAC suspend with WoL enabled, phylink_mac_change() will put
> > > > > > link down, but there is no phylink_stop() to stop the phylink
> > > > > > instance, so it will link up again, that means .mac_config and
> > > > > > .mac_link_up would be invoked before system suspended. After
> > > > > > system resume back, it will do DMA initialization and SW reset
> > > > > > which let MAC lost the hardware setting (i.e MAC_Configuration
> > > > > > register(offset 0x0) is reset). Since link is up before system
> > > > > > suspended, so .mac_link_up would not be invoked after system
> > > > > > resume back, lead to there is no chance to initialize the
> > > > > > configuration in .mac_link_up callback, as a result, MAC can't
> > > > > > > work any longer.
> > > > >
> > > > > > Have you tried putting phylink_stop in .suspend, and
> > > > > > phylink_start in .resume?
> > > >
> > > > Yes, I tried, but the system can't be wakeup with remote packets.
> > > > Please see the code change.
> > >
> > > That makes it a PHY driver issue then, I guess?
> > > At least some PHY drivers avoid suspending when WoL is active, like
> > > lan88xx_suspend.
> > > Even the phy_suspend function takes wol.wolopts into consideration
> > > before proceeding to call the driver. What PHY driver is it?
> >
> > I think it's not the PHY issue, since both STMMAC and FEC controllers
> > > on i.MX8MP use the same PHY(Realtek RTL8211FD,
> > > drivers/net/phy/realtek.c), there is no issue with FEC.
>
> Note that FEC calls phylink_stop() in fec_suspend() if the net device was up. So
> that kind of rules out phylink and phylib too... and points towards stmmac doing
> something it shouldn't.
Yes, I also compared the logic between FEC and STMMAC. FEC invokes phy_stop() on suspend and
phy_start() on resume whether WoL is active or not, so fec_enet_adjust_link() gets called to
adjust the link and FEC works correctly.
> > > Bad assumption in the stmmac driver, if the intention was for the
> > > link state change to be induced to phylink after the resume?
> >
> > Yes, I also think link state change should be captured after the
> > resume, it's very strange that link up again before suspended. You would see
> > below log if I add no_console_suspend in cmdline.
>
> ... because phylink_mac_change() is not supposed to be used to force the link
> down. I can't say this loudly enough: Read the documentation.
> I don't write it just for my pleasure, it's there to help others get stuff correct. If
> people aren't going to read it, I might as well not waste the time writing it.
The documentation is very valuable and deserves respect. I am sorry for not noticing the doc before.
Sometimes I also do not fully understand it. Sorry again.
> * phylink_mac_change() - notify phylink of a change in MAC state
> * @pl: a pointer to a &struct phylink returned from phylink_create()
> * @up: indicates whether the link is currently up.
> *
> * The MAC driver should call this driver when the state of its link
> * changes (eg, link failure, new negotiation results, etc.)
>
> Realise that "up" is there merely to capture that the link has gone down - but
> by the time phylink reacts to that (which may be some time
> *after* this call has been made - it is *not* synchronous since it's meant to be
> called from an *interrupt*) the link state may well have changed again. So,
> phylink will always recheck the link state with up = false, so you _will_ get the
> link going down and then up.
Yes, as I described in the commit message, I noticed that phylink_mac_change() does not stop
the phylink instance, so the link comes up again before the system suspends. What I want to say here
is that the current STMMAC suspend/resume path is not correct when WoL is active.
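To make the failure mode concrete, here is a toy model in plain C. It is only a sketch under my own assumptions, not driver code: every toy_* name is made up for illustration and none of them is a real kernel symbol. It models a link state machine that invokes its link-up callback only on a down-to-up transition, and a software reset on resume that wipes the MAC configuration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: all toy_* names are hypothetical, not kernel symbols. */
struct toy_mac {
	bool hw_configured;    /* stands in for MAC_Configuration being programmed */
	bool link_reported_up; /* link state last reported to the MAC */
	bool machine_running;  /* false between toy_stop() and toy_start() */
};

/* Models .mac_link_up: the only place that programs the MAC here. */
static void toy_link_up(struct toy_mac *m)
{
	m->hw_configured = true;
	m->link_reported_up = true;
}

/* Models .mac_link_down. */
static void toy_link_down(struct toy_mac *m)
{
	m->link_reported_up = false;
}

/* Callbacks fire only when the reported state actually changes. */
static void toy_resolve(struct toy_mac *m, bool phys_link)
{
	if (!m->machine_running)
		return;
	if (phys_link && !m->link_reported_up)
		toy_link_up(m);
	else if (!phys_link && m->link_reported_up)
		toy_link_down(m);
}

/* Models a "MAC state changed" hint: a reported down is taken, then the
 * real link state is rechecked, so a connected link comes straight back up. */
static void toy_mac_change(struct toy_mac *m, bool up, bool phys_link)
{
	if (!m->machine_running)
		return;
	if (!up && m->link_reported_up)
		toy_link_down(m);
	toy_resolve(m, phys_link);
}

static void toy_stop(struct toy_mac *m)
{
	if (m->link_reported_up)
		toy_link_down(m);
	m->machine_running = false;
}

static void toy_start(struct toy_mac *m, bool phys_link)
{
	m->machine_running = true;
	toy_resolve(m, phys_link);
}

/* Runs one suspend/resume cycle with the cable plugged in and returns
 * whether the MAC is configured afterwards. */
bool toy_cycle(bool stop_across_suspend)
{
	struct toy_mac m = { .machine_running = true };

	toy_resolve(&m, true);              /* normal link up before suspend */

	toy_mac_change(&m, false, true);    /* suspend: link goes down, then up */
	if (stop_across_suspend)
		toy_stop(&m);               /* WoL-disabled style path */

	m.hw_configured = false;            /* resume: SW reset wipes the MAC */

	if (stop_across_suspend)
		toy_start(&m, true);
	toy_mac_change(&m, true, true);     /* resume: no transition if still up */

	return m.hw_configured;
}
```

In this model toy_cycle(true) leaves the MAC configured, because the stop/start pair forces a fresh link-up callback after the reset. toy_cycle(false) does not, because the link was already reported up before the reset and nothing triggers the callback again.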
> In any case "change in MAC state" is only applicable when in in-band mode, not
> in PHY mode, so you should not be calling this if you have a PHY attached which
> isn't in in-band mode.
Got it, thanks.
> >
> > root@...8mpevk:~# ethtool -s eth1 wol g
> > [ 76.309460] stmmac: wakeup enable
>
> So you've asked it to wake on MagicPacket, which is WAKE_MAGIC. As you got
> the message "wakeup enable" which is emitted by stmmac_set_wol(), this will
> only be emitted if priv->plat->pmt() is set. It will _not_ call
> phylink_ethtool_set_wol().
Right.
> So, you are not using the PHY-based wake-on-lan, but the MAC based
> wake-on-lan.
Right.
> This means you need to have the phy <-> mac link up during
> suspend, and in that case, yes, you do not want to call
> phylink_stop() or phylink_start().
I have a question here: why does the phy<->mac link need to be up during suspend?
My understanding is that we only need to ensure the PHY is not suspended. FEC, for example, calls
phy_stop() on suspend with MAC-based WoL active, and it can still be woken up by magic packets.
As you described in a past thread, phylink_stop() and phylink_start() also need to be called even with
WoL active.
> I'm not sure what stmmac_pmt() does - thanks to the macro stuff, I'd need to
> trace it through the driver and find out where that goes, and also which variant
> of stmmac you're using... so without more information I can't follow what the
> driver is doing.
I think stmmac_pmt() has no effect on this issue; it just programs the hardware to enable the WoL feature. Please see below:
https://elixir.bootlin.com/linux/v5.14-rc7/source/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c#L300
I am using version 5.10a.
Best Regards,
Joakim Zhang