lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 1 Sep 2021 11:42:08 +0000
From:   Joakim Zhang <qiangqing.zhang@....com>
To:     Vladimir Oltean <olteanv@...il.com>
CC:     "peppe.cavallaro@...com" <peppe.cavallaro@...com>,
        "alexandre.torgue@...s.st.com" <alexandre.torgue@...s.st.com>,
        "joabreu@...opsys.com" <joabreu@...opsys.com>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "kuba@...nel.org" <kuba@...nel.org>,
        "mcoquelin.stm32@...il.com" <mcoquelin.stm32@...il.com>,
        "linux@...linux.org.uk" <linux@...linux.org.uk>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "andrew@...n.ch" <andrew@...n.ch>,
        "f.fainelli@...il.com" <f.fainelli@...il.com>,
        "hkallweit1@...il.com" <hkallweit1@...il.com>,
        dl-linux-imx <linux-imx@....com>
Subject: RE: [PATCH] net: stmmac: fix MAC not working when system resume back
 with WoL enabled


Hi Vladimir,

> -----Original Message-----
> From: Vladimir Oltean <olteanv@...il.com>
> Sent: 2021年9月1日 18:56
> To: Joakim Zhang <qiangqing.zhang@....com>
> Cc: peppe.cavallaro@...com; alexandre.torgue@...s.st.com;
> joabreu@...opsys.com; davem@...emloft.net; kuba@...nel.org;
> mcoquelin.stm32@...il.com; linux@...linux.org.uk;
> netdev@...r.kernel.org; andrew@...n.ch; f.fainelli@...il.com;
> hkallweit1@...il.com; dl-linux-imx <linux-imx@....com>
> Subject: Re: [PATCH] net: stmmac: fix MAC not working when system resume
> back with WoL enabled
> 
> On Wed, Sep 01, 2021 at 10:25:15AM +0000, Joakim Zhang wrote:
> >
> > Hi Vladimir,
> >
> > > -----Original Message-----
> > > From: Vladimir Oltean <olteanv@...il.com>
> > > Sent: 2021年9月1日 17:22
> > > To: Joakim Zhang <qiangqing.zhang@....com>
> > > Cc: peppe.cavallaro@...com; alexandre.torgue@...s.st.com;
> > > joabreu@...opsys.com; davem@...emloft.net; kuba@...nel.org;
> > > mcoquelin.stm32@...il.com; linux@...linux.org.uk;
> > > netdev@...r.kernel.org; andrew@...n.ch; f.fainelli@...il.com;
> > > hkallweit1@...il.com; dl-linux-imx <linux-imx@....com>
> > > Subject: Re: [PATCH] net: stmmac: fix MAC not working when system
> > > resume back with WoL enabled
> > >
> > > On Wed, Sep 01, 2021 at 05:02:28PM +0800, Joakim Zhang wrote:
> > > > We can reproduce this issue with below steps:
> > > > 1) enable WoL on the host
> > > > 2) host system suspended
> > > > 3) remote client send out wakeup packets We can see that host
> > > > system resume back, but can't work, such as ping failed.
> > > >
> > > > After a bit digging, this issue is introduced by the commit
> > > > 46f69ded988d
> > > > ("net: stmmac: Use resolved link config in mac_link_up()"), which
> > > > use the finalised link parameters in mac_link_up() rather than the
> > > > parameters in mac_config().
> > > >
> > > > There are two scenarios for MAC suspend/resume:
> > > >
> > > > 1) MAC suspend with WoL disabled, stmmac_suspend() call
> > > > phylink_mac_change() to notify phylink machine that a change in
> > > > MAC state, then .mac_link_down callback would be invoked. Further,
> > > > it will call phylink_stop() to stop the phylink instance. When MAC
> > > > resume back, firstly phylink_start() is called to start the
> > > > phylink instance, then call phylink_mac_change() which will
> > > > finally trigger phylink machine to invoke .mac_config and
> > > > .mac_link_up callback. All is fine since configuration in these two callbacks
> will be initialized.
> > > >
> > > > 2) MAC suspend with WoL enabled, phylink_mac_change() will put
> > > > link down, but there is no phylink_stop() to stop the phylink
> > > > instance, so it will link up again, that means .mac_config and
> > > > .mac_link_up would be invoked before system suspended. After
> > > > system resume back, it will do DMA initialization and SW reset
> > > > which let MAC lost the hardware setting (i.e MAC_Configuration
> > > > register(offset 0x0) is reset). Since link is up before system
> > > > suspended, so .mac_link_up would not be invoked after system
> > > > resume back, lead to there is no chance to initialize the
> > > > configuration in .mac_link_up callback, as a result, MAC can't work any
> longer.
> > >
> > > Have you tried putting phylink_stop in .suspend, and phylink_start
> in .resume?
> >
> > Yes, I tried, but the system can't be wakeup with remote packets.
> > Please see the code change.
> 
> That makes it a PHY driver issue then, I guess?
> At least some PHY drivers avoid suspending when WoL is active, like
> lan88xx_suspend.
> Even the phy_suspend function takes wol.wolopts into consideration before
> proceeding to call the driver. What PHY driver is it?

I think it's not the PHY issue, since both STMMAC and FEC controllers on i.MX8MP use the same
PHY(Realtek RTL8211FD, drivers/net/phy/realtek.c), there is no issue with FEC.


> > > Do you know exactly why it used to work prior to this patch?
> >
> > Yes, since it configures the MAC_CTRL_REG register in .mac_config
> > callback, it will be called when system resume back with WoL enabled.
> > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Felix
> >
> ir.bootlin.com%2Flinux%2Fv5.4.143%2Fsource%2Fdrivers%2Fnet%2Fethernet%
> >
> 2Fstmicro%2Fstmmac%2Fstmmac_main.c%23L852&amp;data=04%7C01%7Cq
> iangqing
> > .zhang%40nxp.com%7C412a8b69c1244d4c4ab708d96d371e52%7C686ea1d3
> bc2b4c6f
> >
> a92cd99c5c301635%7C0%7C0%7C637660905771744718%7CUnknown%7CTW
> FpbGZsb3d8
> >
> eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D
> %7C1
> >
> 000&amp;sdata=8ony5ZI%2BF31eDduK9Qh0CVIPDHE3EiBdnab6osiyUhc%3D&
> amp;res
> > erved=0
> >
> > If configure the MAC_CTRL_REG register in .mac_link_up callback, when
> > system resume back with WoL active, .mac_link_up would not be called, so
> MAC can't work any longer.
> > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Felix
> >
> ir.bootlin.com%2Flinux%2Fv5.14-rc7%2Fsource%2Fdrivers%2Fnet%2Fethernet
> > %2Fstmicro%2Fstmmac%2Fstmmac_main.c%23L1044&amp;data=04%7C01
> %7Cqiangqi
> >
> ng.zhang%40nxp.com%7C412a8b69c1244d4c4ab708d96d371e52%7C686ea1d
> 3bc2b4c
> >
> 6fa92cd99c5c301635%7C0%7C0%7C637660905771744718%7CUnknown%7CT
> WFpbGZsb3
> >
> d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%
> 3D%7
> >
> C1000&amp;sdata=O%2B%2BUW01PUL4Tp1yt3kQ0bhWIo%2Buc37RFUENLcla
> C6AM%3D&a
> > mp;reserved=0
> 
> Ok, so it worked because phylink_mac_change triggers a phylink resolve, and
> that function calls phylink_mac_config if the link is up (which it is), but
> phylink_link_up only if the link state actually changed (which it did not).
> So you are saying that the momentary link flap induced by
> phylink_mac_change(false), which set pl->mac_link_dropped = true, all
> consumed itself _before_ the actual suspend, and therefore does not help after
> the resume. Interesting behavior.

Yes, what I have seen at my side is what you concluded.

> Bad assumption in the stmmac driver, if the intention was for the link state
> change to be induced to phylink after the resume?

Yes, I also think link state change should be captured after the resume, it's very strange that
link up again before suspended. You would see below log if I add no_console_suspend in cmdline.

root@...8mpevk:~# ethtool -s eth1 wol g
[   76.309460] stmmac: wakeup enable
root@...8mpevk:~# echo mem > /sys/power/state
[   83.278489] PM: suspend entry (deep)
[   83.285371] Filesystems sync: 0.003 seconds
[   83.293310] Freezing user space processes ... (elapsed 0.001 seconds) done.
[   83.301833] OOM killer disabled.
[   83.305069] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[   83.938416] imx-dwmac 30bf0000.ethernet eth1: Link is Down                                      -----> link down
[   83.945022] imx-dwmac 30bf0000.ethernet eth1: Link is Up - 1Gbps/Full - flow control rx/tx              ----> link up
[   83.986280] PM: suspend devices took 0.672 seconds
[   83.994724] Disabling non-boot CPUs ...
[   83.999007] CPU1: shutdown
[   84.001727] psci: CPU1 killed (polled 0 ms)
[   84.007315] IRQ 14: no longer affine to CPU2
[   84.007451] CPU2: shutdown
[   84.014445] psci: CPU2 killed (polled 0 ms)
[   84.020220] CPU3: shutdown
[   84.022933] psci: CPU3 killed (polled 0 ms)

Best Regards,
Joakim Zhang

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ