Date: Fri, 25 Mar 2022 16:17:13 +0100
From: Andrew Lunn <andrew@...n.ch>
To: Francesco Dolcini <francesco.dolcini@...adex.com>
Cc: Russell King <linux@...linux.org.uk>, netdev <netdev@...r.kernel.org>,
    fugang.duan@....com, Chris Healy <cphealy@...il.com>
Subject: Re: FEC MDIO timeout and polled IO

On Fri, Mar 25, 2022 at 03:08:08PM +0100, Francesco Dolcini wrote:
> Hello Andrew and all,
> I was recently debugging an issue in the FEC driver, about 2% of the
> time the driver is failing with "MDIO read timeout" at boot on a 5.4
> kernel.
>
> This issue is not new and from time to time appear again, it seems that
> the previous interrupt based mechanism is somehow easy to break.
>
> I backported your patch
> f166f890c8f0 (net: ethernet: fec: Replace interrupt driven MDIO with polled IO, 2020-05-02)
> to kernel 5.4 and it seems that it fixes the issue (I was able to do 470
> power cycles, while before it was failing after a couple of hundreds
> cycles best case).
>
> Shouldn't this patch be backported to kernel 5.4?

Hi Francesco

This patch was purely a performance boost; it was not a bug fix in any
way.

That change also caused a lot of pain. There are at least two
different implementations of the MDIO bus in the FEC, and they behave
slightly differently. So what worked for me with the Vybrid broke
some other platforms. It took an NXP software engineer talking to
their hardware guys to figure out how to do this correctly, which is
why you will see a complicated patch history.

I personally would not recommend a backport, unless you can test the
backport on a wide range of SoCs with the FEC.

If you are getting timeouts, I would suggest you look at whatever else
is happening in the system during boot. Are interrupts getting
disabled for too long? Is something blocking the running of the
completion?

Or just update to v5.15.

    Andrew