Message-ID: <CACdvmAjjgnLzipU7-A+RFjiqh3ijBaJ9gGcRzCHF_NxYevGu9w@mail.gmail.com>
Date: Fri, 15 Jul 2022 02:58:04 -0400
From: Da Xue <da@...sconfused.com>
To: Jerome Brunet <jbrunet@...libre.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@...com>,
	Alexandre Torgue <alexandre.torgue@...s.st.com>,
	Jose Abreu <joabreu@...opsys.com>,
	Erico Nunes <nunes.erico@...il.com>,
	netdev@...r.kernel.org,
	linux-amlogic@...ts.infradead.org,
	Kevin Hilman <khilman@...libre.com>,
	Neil Armstrong <narmstrong@...libre.com>,
	Vyacheslav <adeep@...ina.in>,
	Heiner Kallweit <hkallweit1@...il.com>,
	Qi Duan <qi.duan@...ogic.com>
Subject: Re: [RFC/RFT PATCH] net: stmmac: do not poke MAC_CTRL_REG twice on link up

On Wed, Jul 13, 2022 at 5:24 AM Da Xue <da@...sconfused.com> wrote:
>
> On Thu, Jul 7, 2022 at 6:14 AM Jerome Brunet <jbrunet@...libre.com> wrote:
> >
> > For some reason, poking MAC_CTRL_REG a second time, even with the same
> > value, causes problem on a dwmac 3.70a.
> >
> > This problem happens on all the Amlogic SoCs, on link up, when the RMII
> > 10/100 internal interface is used. The problem does not happen on boards
> > using the external RGMII 10/100/1000 interface. Initially we suspected the
> > PHY to be the problem but after a lot of testing, the problem seems to be
> > coming from the MAC controller.
> >
> > > meson8b-dwmac c9410000.ethernet: IRQ eth_wake_irq not found
> > > meson8b-dwmac c9410000.ethernet: IRQ eth_lpi not found
> > > meson8b-dwmac c9410000.ethernet: PTP uses main clock
> > > meson8b-dwmac c9410000.ethernet: User ID: 0x11, Synopsys ID: 0x37
> > > meson8b-dwmac c9410000.ethernet: DWMAC1000
> > > meson8b-dwmac c9410000.ethernet: DMA HW capability register supported
> > > meson8b-dwmac c9410000.ethernet: RX Checksum Offload Engine supported
> > > meson8b-dwmac c9410000.ethernet: COE Type 2
> > > meson8b-dwmac c9410000.ethernet: TX Checksum insertion supported
> > > meson8b-dwmac c9410000.ethernet: Wake-Up On Lan supported
> > > meson8b-dwmac c9410000.ethernet: Normal descriptors
> > > meson8b-dwmac c9410000.ethernet: Ring mode enabled
> > > meson8b-dwmac c9410000.ethernet: Enable RX Mitigation via HW Watchdog Timer
> >
> > The problem is not systematic. Its occurence is very random from 1/50 to
> > 1/2. It is fairly easy to detect by setting the kernel to boot over NFS and
> > possibly setting it to reboot automatically when reaching the prompt.
> >
> > When problem happens, the link is reported up by the PHY but no packet are
> > actually going out. DHCP requests eventually times out and the kernel reset
> > the interface. It may take several attempts but it will eventually work.
> >
> > > meson8b-dwmac ff3f0000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
> > > Sending DHCP requests ...... timed out!
> > > meson8b-dwmac ff3f0000.ethernet eth0: Link is Down
> > > IP-Config: Retrying forever (NFS root)...
> > > meson8b-dwmac ff3f0000.ethernet eth0: PHY [0.1:08] driver [Meson G12A Internal PHY] (irq=POLL)
> > > meson8b-dwmac ff3f0000.ethernet eth0: Register MEM_TYPE_PAGE_POOL RxQ-0
> > > meson8b-dwmac ff3f0000.ethernet eth0: No Safety Features support found
> > > meson8b-dwmac ff3f0000.ethernet eth0: PTP not supported by HW
> > > meson8b-dwmac ff3f0000.ethernet eth0: configuring for phy/rmii link mode
> > > meson8b-dwmac ff3f0000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
> > > Sending DHCP requests ...... timed out!
> > > meson8b-dwmac ff3f0000.ethernet eth0: Link is Down
> > > IP-Config: Retrying forever (NFS root)...
> > > [...] 5 retries ...
> > > IP-Config: Retrying forever (NFS root)...
> > > meson8b-dwmac ff3f0000.ethernet eth0: PHY [0.1:08] driver [Meson G12A Internal PHY] (irq=POLL)
> > > meson8b-dwmac ff3f0000.ethernet eth0: Register MEM_TYPE_PAGE_POOL RxQ-0
> > > meson8b-dwmac ff3f0000.ethernet eth0: No Safety Features support found
> > > meson8b-dwmac ff3f0000.ethernet eth0: PTP not supported by HW
> > > meson8b-dwmac ff3f0000.ethernet eth0: configuring for phy/rmii link mode
> > > meson8b-dwmac ff3f0000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
> > > Sending DHCP requests ., OK
> > > IP-Config: Got DHCP answer from 10.1.1.1, my address is 10.1.3.229
> >
> > Of course the same problem happens when not using NFS and it fairly
> > difficult for IoT products to detect this situation and recover.
> >
> > The call to stmmac_mac_set() should be no-op in our case, the bits it sets
> > have already been set by an earlier call to stmmac_mac_set(). However
> > removing this call solves the problem. We have no idea why or what is the
> > actual problem.
> >
> > Even weirder, keeping the call to stmmac_mac_set() but inserting a
> > udelay(1) between writel() and stmmac_mac_set() solves the problem too.
> >
> > Suggested-by: Qi Duan <qi.duan@...ogic.com>
> > Signed-off-by: Jerome Brunet <jbrunet@...libre.com>
> > ---
> >
> > Hi,
> >
> > There is no intention to get this patch merged as it is.
> > It is sent with the hope to get a better understanding of the issue
> > and more testing.
> >
> > The discussion on this issue initially started on this thread
> > https://lore.kernel.org/all/CAK4VdL3-BEBzgVXTMejrAmDjOorvoGDBZ14UFrDrKxVEMD2Zjg@mail.gmail.com/
> >
> > The patches previously proposed in this thread have not solved the
> > problem.
> >
> > The line removed in this patch should be a no-op when it comes to the
> > value of MAC_CTRL_REG. So the change should make not a difference but
> > it does. Testing result have been very good so far so there must be an
> > unexpected consequence on the HW. I hope that someone with more
> > knowledge on this controller will be able to shine some light on this.
> >
> > Cheers
> > Jerome
> >
> >  drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 1 -
> >  1 file changed, 1 deletion(-)
> >
> > diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > index d1a7cf4567bc..3dca3cc61f39 100644
> > --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > @@ -1072,7 +1072,6 @@ static void stmmac_mac_link_up(struct phylink_config *config,
> >
> >  	writel(ctrl, priv->ioaddr + MAC_CTRL_REG);
> >
> > -	stmmac_mac_set(priv, priv->ioaddr, true);
> >  	if (phy && priv->dma_cap.eee) {
> >  		priv->eee_active = phy_init_eee(phy, 1) >= 0;
> >  		priv->eee_enabled = stmmac_eee_init(priv);
> > --
> > 2.36.1
> >
>
> We had a problem with GXL (S805X/S905X) where the ethernet interface
> would sometimes not come up. Before the 5.10 LTS, it was just a matter
> of bringing down and up (ip link set) the interface to fix the issue.
> With 5.15, 5.18, and 5.19, we would get "meson8b-dwmac
> c9410000.ethernet eth0: Reset adapter."
> No amount of link down ups can
> fix it anymore.

I realized that I did not add the ethernet reset in the device tree
that u-boot was passing to Linux. Sorry about the noise on this.

>
> When we get the "meson8b-dwmac c9410000.ethernet eth0: Reset
> adapter.", it affects traffic on the network switch. I have a ping
> going from two different devices on a GS108PP PoE network switch and
> it would go through the roof. When I remove the GXL board, everything
> comes back to normal.

Given that the reset fixes the ethernet issues, the hardware still
could be causing this but it is no longer long enough to notice.

>
> We would get randomized corruption when ethernet is brought up
> (successfully or not) about half the time. If it boots up without a
> problem, it remains super stable. I would run benchmarks for CPU, 3D,
> and ethernet for days without that glitch ever appearing. It seems to
> be determined at startup.

This is gone with ethernet reset in the device tree and the no
double-poke register change Jerome provided.

Best,

Da