[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20161211201104.GB20574@amd>
Date: Sun, 11 Dec 2016 21:11:04 +0100
From: Pavel Machek <pavel@....cz>
To: Lino Sanfilippo <LinoSanfilippo@....de>
Cc: Francois Romieu <romieu@...zoreil.com>, bh74.an@...sung.com,
ks.giri@...sung.com, vipul.pandya@...sung.com,
peppe.cavallaro@...com, alexandre.torgue@...com,
davem@...emloft.net, linux-kernel@...r.kernel.org,
netdev@...r.kernel.org
Subject: Re: [PATCH 1/2] net: ethernet: sxgbe: remove private tx queue lock
Hi!
> On 09.12.2016 12:21, Pavel Machek wrote:
> > On Fri 2016-12-09 00:19:43, Francois Romieu wrote:
> >> Lino Sanfilippo <LinoSanfilippo@....de> :
> >> [...]
> >> > OTOH Pavel said that he actually could produce a deadlock. Now I wonder if
> >> > this is caused by that locking scheme (in a way I have not figured out yet)
> >> > or if it is a different issue.
> >>
> >> stmmac_tx_err races with stmmac_xmit.
> >
> > Umm, yes, that looks real.
> >
> > And that means that removing tx_lock will not be completely trivial
> > :-(. Lino, any ideas there?
> >
>
> Ok, the race is there but it looks like a problem that is not related to
> the use or removal of the private lock.
Well, removal of the private lock will make it trickier to fix :-(.
> By a glimpse into other drivers (e.g sky2 or e1000), a possible way to handle a
> tx error is to start a separate task and restart the tx path in that task instead
> the irq handler (or timer in case of the watchdog).
>
> In that task we could do:
> 1. deactivate napi
> 2. deactivate irqs
> 3. wait for running napi/irqs do complete (_sync)
> 4. call stmmac_tx_err()
> 5. reenable napi
> 6. reenable irqs
>
> We have to ensure that no xmit() is executing while stmmac_tx_err() does the cleanup,
> so stmmac_tx_err() should IMO rather call netif_tx_disable() instead of netif_stop_queue()
> (the former grabs the xmit lock before it sets __QUEUE_STATE_DRV_XOFF to disable
> the queue).
Do you understand what stmmac_tx_err(priv); is supposed to do? In
particular, if it is called while the driver is working ok -- should
the driver survive that?
Because it does not currently, and I don't know how to test that
code. Unplugging the cable does not provoke that.
I tried
} else if (unlikely(status == tx_hard_error))
stmmac_tx_err(priv);
+
+ {
+ static int i;
+ i++;
+ if (i==1000) {
+ i = 0;
+ printk("Simulated error\n");
+ stmmac_tx_err(priv);
+ }
+ }
}
but the driver does not survive that :-(.
Best regards,
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Download attachment "signature.asc" of type "application/pgp-signature" (182 bytes)
Powered by blists - more mailing lists