[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <519e17e0-9739-43bc-b77a-a77fd103d8d7@lunn.ch>
Date: Wed, 14 May 2025 14:32:22 +0200
From: Andrew Lunn <andrew@...n.ch>
To: Romain Gantois <romain.gantois@...tlin.com>
Cc: Antoine Tenart <atenart@...nel.org>,
Heiner Kallweit <hkallweit1@...il.com>,
Russell King <linux@...linux.org.uk>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Maxime Chevallier <maxime.chevallier@...tlin.com>,
Thomas Petazzoni <thomas.petazzoni@...tlin.com>,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net-next 3/3] net: phy: dp83869: Support 1000Base-X SFP
modules
> > > + /* Update advertisement */
> > > + if (mutex_trylock(&phydev->lock)) {
> > > + ret = dp83869_config_aneg(phydev);
> > > + mutex_unlock(&phydev->lock);
> > > + }
> >
> > Just skimmed through this quickly and it's not clear to me why aneg is
> > restarted only if there was no contention on the global phydev lock;
> > it's not guaranteed a concurrent holder would do the same. If this is
> > intended, a comment would be welcomed.
>
> The reasoning here is that there are code paths which call
> dp83869_port_configure_serdes() with phydev->lock already held, for example:
>
> phy_start() -> sfp_upstream_start() -> sfp_start() -> \
> sfp_sm_event() -> __sfp_sm_event() -> sfp_sm_module() -> \
> sfp_module_insert() -> phy_sfp_module_insert() -> \
> dp83869_port_configure_serdes()
>
> so taking this lock could result in a deadlock.
>
> mutex_trylock() is definitely not a perfect solution though, but I went with it
> partly because the marvell-88x2222 driver already does it this way, and partly
> because if phydev->lock() is held, then there's a solid chance that the phy
> state machine is already taking care of reconfiguring the advertisement.
> However, I'll admit that this is a bit of a shaky argument.
>
> If someone has a better solution in mind, I'll gladly hear it out, but for now
> I guess I'll just add a comment explaining why trylock() is being used.
The marvell10g driver should be the reference to look at.
As you say, phy_start() will eventually get around to calling
dp83869_config_aneg(). What is more interesting here are the paths
which lead to this function which don't result in a call to
dp83869_config_aneg(). What are those?
Andrew
Powered by blists - more mailing lists