[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1442528065.97487.6.camel@infradead.org>
Date: Thu, 17 Sep 2015 23:14:25 +0100
From: David Woodhouse <dwmw2@...radead.org>
To: Francois Romieu <romieu@...zoreil.com>
Cc: Stephen Hemminger <stephen@...workplumber.org>,
David Miller <davem@...emloft.net>, netdev@...r.kernel.org
Subject: Re: [PATCH net 2/2] 8139cp: reset BQL when ring tx ring cleared
On Thu, 2015-09-17 at 22:44 +0200, Francois Romieu wrote:
> David Woodhouse <dwmw2@...radead.org> :
> > On Thu, 2015-09-17 at 12:36 +0100, David Woodhouse wrote:
> > >
> > > Thanks; I'll try that. In fact since updating to 4.2 the problem has
> > > got worse — now the whole machine dies:
> >
> > There is something very strange going on here. I've found two ways to
> > make it stop crashing when cp_tx_timeout() hits the 'popf' when
> > unlocking the spinlock.
>
> cp_tx_timeout takes lock, disables irq, calls cp_clean_rings, thus
> plain dev_kfree_skb if a skb is still referenced in one of the
> rx/tx ring. You may replace it with dev_kfree_skb_any.
Well spotted; I've made that change locally. Although I don't think it
explains the symptoms. Not that I'm sure what *could*.
I've also found that adding a call to __cp_set_rx_mode() seems to fix
the RX after reset, in some tests. Especially the simulated one via the
hack in cp_set_wol(). I think that's necessary, if not sufficient — at
least on real hardware. I didn't see the problem at all when running in
qemu.
Sometimes, though, it still dies in an interrupt storm after re
-enabling IRQs:
[ 900.004214] 8139cp 0000:00:0b.0 eth1: Transmit timeout, status c 2b 0 80ff
[ 900.011725] will lock...
[ 900.014273] Handling tx timeout, flags 200296
[ 900.018774] Will wake queue...
[ 900.021645] Will unlock... flags 200296
[ 900.021645] 8139cp 0000:00:0b.0 eth1: intr, status 0001 enable 80ff cmd 0c cpcmd 002b
[ 900.021645] 8139cp 0000:00:0b.0 eth1: intr, status 0001 enable 80ff cmd 0c cpcmd 002b
...
[ 901.628439] 8139cp 0000:00:0b.0 eth1: intr, status 0001 enable 80ff cmd 0c cpcmd 002b
[ 901.636291] 8139cp 0000:00:0b.0 eth1: intr, status 0011 enable 80ff cmd 0c cpcmd 002b
...
[ 901.966243] 8139cp 0000:00:0b.0 eth1: intr, status 0011 enable 80ff cmd 0c cpcmd 002b
[ 901.968353] 8139cp 0000:00:0b.0 eth1: intr, status 0051 enable 80ff cmd 0c cpcmd 002b
... forever...
And of course, even if I fix the TX timeout handling, I'd still like to
know why it's happening in the first place...
--
dwmw2
Download attachment "smime.p7s" of type "application/x-pkcs7-signature" (5691 bytes)
Powered by blists - more mailing lists