Message-ID: <20240904124949.563f1343@fedora.home>
Date: Wed, 4 Sep 2024 12:49:49 +0200
From: Maxime Chevallier <maxime.chevallier@...tlin.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: davem@...emloft.net, Pantelis Antoniou <pantelis.antoniou@...il.com>,
Andrew Lunn <andrew@...n.ch>, Eric Dumazet <edumazet@...gle.com>, Paolo
Abeni <pabeni@...hat.com>, Russell King <linux@...linux.org.uk>, Christophe
Leroy <christophe.leroy@...roup.eu>, Florian Fainelli
<f.fainelli@...il.com>, Heiner Kallweit <hkallweit1@...il.com>,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
thomas.petazzoni@...tlin.com, Herve Codina <herve.codina@...tlin.com>,
Simon Horman <horms@...nel.org>, linuxppc-dev@...ts.ozlabs.org
Subject: Re: [PATCH net-next v2 7/7] net: ethernet: fs_enet: phylink
conversion
Hi Jakub,
On Mon, 2 Sep 2024 18:55:43 -0700
Jakub Kicinski <kuba@...nel.org> wrote:
> On Thu, 29 Aug 2024 18:15:30 +0200 Maxime Chevallier wrote:
> > @@ -582,15 +591,12 @@ static void fs_timeout_work(struct work_struct *work)
> >
> > dev->stats.tx_errors++;
> >
> > - spin_lock_irqsave(&fep->lock, flags);
> > -
> > - if (dev->flags & IFF_UP) {
> > - phy_stop(dev->phydev);
> > - (*fep->ops->stop)(dev);
> > - (*fep->ops->restart)(dev);
> > - }
> > + rtnl_lock();
>
> so we take rtnl_lock here..
>
> > + phylink_stop(fep->phylink);
> > + phylink_start(fep->phylink);
> > + rtnl_unlock();
> >
> > - phy_start(dev->phydev);
> > + spin_lock_irqsave(&fep->lock, flags);
> > wake = fep->tx_free >= MAX_SKB_FRAGS &&
> > !(CBDR_SC(fep->cur_tx) & BD_ENET_TX_READY);
> > spin_unlock_irqrestore(&fep->lock, flags);
>
> > @@ -717,19 +686,18 @@ static int fs_enet_close(struct net_device *dev)
> > unsigned long flags;
> >
> > netif_stop_queue(dev);
> > - netif_carrier_off(dev);
> > napi_disable(&fep->napi);
> > cancel_work_sync(&fep->timeout_work);
>
> ..and cancel_work_sync() under rtnl_lock here?
>
> IDK if removing the "dev->flags & IFF_UP" check counts as
> meaningfully making it worse, but we're going in the wrong direction.
> The _sync() has to go, and the timeout work needs to check if device
> has been closed under rtnl_lock ?
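Argh, that's true, I didn't consider that call path at all... Sorry about
that, I'll rework this to address the deadlock waiting to happen:
fs_enet_close() runs with rtnl_lock held, so its cancel_work_sync() can
wait forever on a timeout work that is itself blocked on rtnl_lock().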
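To make the suggested direction concrete, here is a minimal userspace sketch of the pattern, not the actual rework. It assumes the timeout work checks whether the device is still open once it holds the lock, and that the close path no longer waits for the work. Every name below (model_dev, model_rtnl_lock(), model_timeout_work(), model_close()) is a hypothetical stand-in rather than a kernel API, and the "lock" is only a flag that asserts on double acquisition.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_dev {
	bool rtnl_held;	/* stands in for rtnl_lock ownership */
	bool running;	/* stands in for "device is still open" */
	int restarts;	/* counts stop/start cycles done by the work */
};

static void model_rtnl_lock(struct model_dev *d)
{
	/* Taking the lock twice is the deadlock discussed above. */
	assert(!d->rtnl_held);
	d->rtnl_held = true;
}

static void model_rtnl_unlock(struct model_dev *d)
{
	assert(d->rtnl_held);
	d->rtnl_held = false;
}

/* Models the timeout work: restart the link only if still open. */
static void model_timeout_work(struct model_dev *d)
{
	model_rtnl_lock(d);
	if (d->running)
		d->restarts++;	/* where the phylink stop/start pair would go */
	model_rtnl_unlock(d);
}

/* Models the close path, which is entered with the lock already held. */
static void model_close(struct model_dev *d)
{
	assert(d->rtnl_held);
	d->running = false;
	/* No synchronous wait for the work here (assumption of this sketch). */
}
```

With this shape, a work item that runs after close simply finds the device closed and does nothing, so the close path never has to wait for it while holding the lock. How the real driver detects "closed" (netif_running() would be one candidate) is left open here.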
Thanks,
Maxime