Message-ID: <20220503161356.GA35226@francesco-nb.int.toradex.com>
Date: Tue, 3 May 2022 18:13:56 +0200
From: Francesco Dolcini <francesco.dolcini@...adex.com>
To: Andrew Lunn <andrew@...n.ch>, netdev@...r.kernel.org
Cc: Francesco Dolcini <francesco.dolcini@...adex.com>,
Andy Duan <fugang.duan@....com>,
Joakim Zhang <qiangqing.zhang@....com>,
Heiner Kallweit <hkallweit1@...il.com>,
Russell King <linux@...linux.org.uk>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
"David S. Miller" <davem@...emloft.net>,
Fabio Estevam <festevam@...il.com>,
Tim Harvey <tharvey@...eworks.com>,
Chris Healy <cphealy@...il.com>
Subject: Re: FEC MDIO read timeout on linkup
Hello all,
On Mon, May 02, 2022 at 08:34:43PM +0200, Francesco Dolcini wrote:
> On Mon, May 02, 2022 at 08:24:53PM +0200, Andrew Lunn wrote:
> > > writing to this register could trigger a FEC_ENET_MII interrupt actually
> > > creating a race condition with fec_enet_mdio_read() that is called on
> > > link change also.
> >
> > An unexpected interrupt will make this exit too early, and the read
> > will get invalid data. An unexpected interrupt would not cause a
> > timeout here, which is what you are reporting.
>
> I guess I need to sleep on this, in the meantime I have a test running
> with the change I described running since a couple of hours.
After a long sleep it seems that my change did not solve the issue. I
also verified that writing to FEC_MII_SPEED does not trigger any
FEC_ENET_MII interrupt in my specific case.
I guess that this could still be a real issue, but it's not my specific
problem.
At the moment I'm a little bit lost. What I have verified so far is the
following:
- fec_enet_mdio_read()/_write() locking. This is correct, both are
  serialized by the mdio mutex.
- potential race condition with the FEC_ENET_MII interrupt while writing
  FEC_MII_SPEED in fec_restart(). Proved wrong both by a test and by the
  fact that no interrupt is generated in my case.
- increasing the fec_enet_mdio_wait() timeout to 100ms does not help.
- clk_ipg is always active: once the device is open the clock is always
  on (verified with runtime power management debugging).
I'm wondering whether this could be related to
fec_enet_adjust_link()->fec_restart() running during a fec_enet_mdio_read(),
with one of the many register writes in fec_restart() creating the
issue, maybe while resetting the FEC. Does this make any sense?
Francesco