Message-Id: <20080617.022909.173003136.davem@davemloft.net>
Date: Tue, 17 Jun 2008 02:29:09 -0700 (PDT)
From: David Miller <davem@...emloft.net>
To: mingo@...e.hu
Cc: kuznet@....inr.ac.ru, vgusev@...nvz.org, mcmanus@...ksong.com,
xemul@...nvz.org, netdev@...r.kernel.org,
ilpo.jarvinen@...sinki.fi, linux-kernel@...r.kernel.org,
e1000-devel@...ts.sourceforge.net, rjw@...k.pl
Subject: Re: [TCP]: TCP_DEFER_ACCEPT causes leak sockets
From: Ingo Molnar <mingo@...e.hu>
Date: Tue, 17 Jun 2008 11:27:06 +0200
> when I originally reported it I debugged it back to missing e1000 TX
> completion IRQs. I tried various versions of the driver to figure out
> whether new workarounds for e1000 cover it but it was fruitless. There
> is a 1000 msec internal watchdog timer IRQ within e1000 that gets things
> going if it's stuck.
Then that explains your latency: the chip is getting stuck and
TX interrupts stop, right?
> But the line sch_generic.c:222 problem is new. It could be an
> escalation of this same problem - not even the hw-internal watchdog
> timeout fixing up things? So basically two levels of completion failed,
> the third fallback level (a hard reset of the interface) helped things
> get going. High score from me for networking layer robustness :-)
I think it is an escalation of the same problem. My first thought
is that there must have been some change to the reset logic and it
isn't as foolproof as it used to be, especially under load.
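
To make the layering concrete, here is a minimal Python sketch of the
three fallback levels described in the quoted text: the TX completion
IRQ, the 1000 msec watchdog timer inside e1000, and the hard reset of
the interface. The function name and level names are made up for
illustration; this is a toy model, not driver or kernel code.

```python
# Hypothetical model of the three recovery levels discussed above.
# None of these names come from the e1000 driver or the kernel.

def recover_stuck_tx(irq_works, driver_watchdog_works, reset_works):
    """Return the name of the first recovery level that unsticks TX."""
    levels = [
        # Level 1: the normal path, a TX completion interrupt arrives.
        ("tx-completion-irq", irq_works),
        # Level 2: the driver's internal watchdog timer (about 1000 msec).
        ("driver-watchdog-timer", driver_watchdog_works),
        # Level 3: the last resort, a hard reset of the interface.
        ("interface-reset", reset_works),
    ]
    for name, works in levels:
        if works:
            return name
    return None  # no level helped; the queue stays stuck
```

In this model the case reported above corresponds to
recover_stuck_tx(False, False, True): the first two levels fail and
only the reset gets things going again.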