Message-ID: <F169D4F5E1F1974DBFAFABF47F60C10A0E873C1B@orsmsx507.amr.corp.intel.com>
Date: Mon, 3 Nov 2008 10:49:53 -0800
From: "Brandeburg, Jesse" <jesse.brandeburg@...el.com>
To: Milan Kocian <milon@...cz>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
CC: "e1000-devel@...ts.sourceforge.net"
<e1000-devel@...ts.sourceforge.net>
Subject: RE: WARNING: at net/sched/sch_generic.c:219
Milan Kocian wrote:
> hello,
>
> for your info I have here one warning. Kernel 2.6.27.4.
> Machine have minimal load and traffic (I can say, it does nothing :-).
> Probably its related to link up/down. (kvm module is only loaded, not
> used)
What kind of hardware do you have? Please send lspci -vvv output. You're
running a 64-bit kernel; does it happen to be an AMD system with a
HyperTransport to PCI Express bridge and >= 4GB of RAM?
> Nov 1 19:46:40 srw1 kernel: 0000:0d:00.0: eth0: Detected Tx Unit
> Hang:
> Nov 1 19:46:40 srw1 kernel: TDH <bf>
> Nov 1 19:46:40 srw1 kernel: TDT <cd>
> Nov 1 19:46:40 srw1 kernel: next_to_use <cd>
> Nov 1 19:46:40 srw1 kernel: next_to_clean <bf>
> Nov 1 19:46:40 srw1 kernel: buffer_info[next_to_clean]:
> Nov 1 19:46:40 srw1 kernel: time_stamp <1020c297e>
> Nov 1 19:46:40 srw1 kernel: next_to_watch <bf>
> Nov 1 19:46:40 srw1 kernel: jiffies <1020c3cb4>
> Nov 1 19:46:40 srw1 kernel: next_to_watch.status <0>
> Nov 1 19:46:42 srw1 kernel: 0000:0d:00.0: eth0: Detected Tx Unit
> Hang:
> Nov 1 19:46:42 srw1 kernel: TDH <bf>
> Nov 1 19:46:42 srw1 kernel: TDT <cd>
> Nov 1 19:46:42 srw1 kernel: next_to_use <cd>
> Nov 1 19:46:42 srw1 kernel: next_to_clean <bf>
> Nov 1 19:46:42 srw1 kernel: buffer_info[next_to_clean]:
> Nov 1 19:46:42 srw1 kernel: time_stamp <1020c297e>
> Nov 1 19:46:42 srw1 kernel: next_to_watch <bf>
> Nov 1 19:46:42 srw1 kernel: jiffies <1020c3f0c>
> Nov 1 19:46:42 srw1 kernel: next_to_watch.status <0>
This part looks like we are reporting a real Tx hang, from the hardware's
perspective at least. Are you sure your duplex settings are correct?
Are you forcing 10 Full?
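
For anyone reading along, the numbers in the dumps quoted above can be checked with a little arithmetic. This is only a rough sketch: the values are copied from the first two dumps, the helper names are made up for illustration, and the ring size of 256 is an assumption (the log does not show it; `ethtool -g eth0` would report the real value).

```python
# Values copied from the first two "Detected Tx Unit Hang" dumps quoted above.
TDH = 0xbf                 # head: next descriptor the hardware will process
TDT = 0xcd                 # tail: next descriptor the driver will fill
TIME_STAMP = 0x1020c297e   # time_stamp of buffer_info[next_to_clean], in jiffies
JIFFIES_1 = 0x1020c3cb4    # jiffies reported in the first dump
JIFFIES_2 = 0x1020c3f0c    # jiffies reported in the second dump

RING_SIZE = 256            # assumption: not shown anywhere in the log


def pending_descriptors(tdh, tdt, ring_size=RING_SIZE):
    """Descriptors queued by the driver that the hardware has not consumed yet."""
    return (tdt - tdh) % ring_size


def stalled_jiffies(time_stamp, now):
    """How long the oldest uncleaned buffer has been waiting, in jiffies."""
    return now - time_stamp


print("pending descriptors:", pending_descriptors(TDH, TDT))              # 14
print("stalled for (jiffies):", stalled_jiffies(TIME_STAMP, JIFFIES_1))   # 4918
print("jiffies between dumps:", JIFFIES_2 - JIFFIES_1)                    # 600
```

TDH and time_stamp are identical in both dumps, so the head did not move at all in between. Converting jiffies to seconds needs the kernel's HZ, which is not in the report. The two dumps are logged about 2 seconds apart and differ by 600 jiffies, which hints at HZ around 300, but syslog timestamps only have one-second resolution, so treat that as a guess.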
> Nov 1 19:46:43 srw1 kernel: ------------[ cut here ]------------
> Nov 1 19:46:43 srw1 kernel: WARNING: at net/sched/sch_generic.c:219
> dev_watchdog+0x22e/0x240()
> Nov 1 19:46:43 srw1 kernel: NETDEV WATCHDOG: eth0 (e1000e): transmit
> timed out
> Nov 1 19:46:43 srw1 kernel: Modules linked in: kvm_intel kvm ipv6
> loop e1000e
> Nov 1 19:46:43 srw1 kernel: Pid: 0, comm: swapper Not tainted
> 2.6.27.4 #2
> Nov 1 19:46:43 srw1 kernel:
> Nov 1 19:46:43 srw1 kernel: Call Trace:
> Nov 1 19:46:43 srw1 kernel: <IRQ> [<ffffffff8023449d>]
> warn_slowpath+0xcd/0x120
> Nov 1 19:46:43 srw1 kernel: [<ffffffff8023521d>] vprintk+0x16d/0x3c0
> Nov 1 19:46:43 srw1 kernel: [<ffffffff8038b829>] __next_cpu+0x19/0x30
> Nov 1 19:46:43 srw1 kernel: [<ffffffff8022bcec>]
> find_busiest_group+0x1dc/0x950
> Nov 1 19:46:43 srw1 kernel: [<ffffffff803914d1>] strlcpy+0x41/0x50
> Nov 1 19:46:43 srw1 kernel: [<ffffffff8042e49e>]
> dev_watchdog+0x22e/0x240
> Nov 1 19:46:43 srw1 kernel: [<ffffffff80424c56>]
> neigh_periodic_timer+0x146/0x1c0
> Nov 1 19:46:43 srw1 kernel: [<ffffffff8024f178>]
> getnstimeofday+0x48/0xc0
> Nov 1 19:46:43 srw1 kernel: [<ffffffff8042e270>]
> dev_watchdog+0x0/0x240
> Nov 1 19:46:43 srw1 kernel: [<ffffffff8023e6be>]
> run_timer_softirq+0x12e/0x200
> Nov 1 19:46:43 srw1 kernel: [<ffffffff80239e43>]
> __do_softirq+0x73/0xf0
> Nov 1 19:46:43 srw1 kernel: [<ffffffff8020cb8c>]
> call_softirq+0x1c/0x30
> Nov 1 19:46:43 srw1 kernel: [<ffffffff8020eb75>] do_softirq+0x35/0x70
> Nov 1 19:46:43 srw1 kernel: [<ffffffff8021c13e>]
> smp_apic_timer_interrupt+0x7e/0xc0
> Nov 1 19:46:43 srw1 kernel: [<ffffffff8020c5d6>]
> apic_timer_interrupt+0x66/0x70
> Nov 1 19:46:43 srw1 kernel: <EOI> [<ffffffff8021329c>]
> mwait_idle+0x3c/0x50
> Nov 1 19:46:43 srw1 kernel: [<ffffffff8020a86c>] cpu_idle+0x5c/0x90
> Nov 1 19:46:43 srw1 kernel:
> Nov 1 19:46:43 srw1 kernel: ---[ end trace c45f300d9ee4877c ]---
> Nov 1 19:46:45 srw1 kernel: 0000:0d:00.0: eth0: Link is Up 10 Mbps
> Full Duplex, Flow Control: None
> Nov 1 19:46:45 srw1 kernel: 0000:0d:00.0: eth0: 10/100 speed:
> disabling TSO
> Nov 1 19:48:40 srw1 kernel: 0000:0d:00.0: eth0: Detected Tx Unit
> Hang:
Hm, roughly 2 minutes from link up to hang, and at least 0xd7 (215)
packets enqueued.
> Nov 1 19:48:40 srw1 kernel: TDH <cc>
> Nov 1 19:48:40 srw1 kernel: TDT <d7>
> Nov 1 19:48:40 srw1 kernel: next_to_use <d7>
> Nov 1 19:48:40 srw1 kernel: next_to_clean <ca>
> Nov 1 19:48:40 srw1 kernel: buffer_info[next_to_clean]:
> Nov 1 19:48:40 srw1 kernel: time_stamp <1020cb4e0>
> Nov 1 19:48:40 srw1 kernel: next_to_watch <cc>
> Nov 1 19:48:40 srw1 kernel: jiffies <1020cc828>
> Nov 1 19:48:40 srw1 kernel: next_to_watch.status <0>
> Nov 1 19:48:43 srw1 kernel: 0000:0d:00.0: eth0: Link is Up 10 Mbps
> Full Duplex, Flow Control: None
> Nov 1 19:48:43 srw1 kernel: 0000:0d:00.0: eth0: 10/100 speed:
> disabling TSO
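
The "roughly 2 minutes" and "215" figures mentioned earlier can be reproduced from the quoted log. A small sketch, assuming the year 2008 from the Date header (syslog lines omit the year); the function names are illustrative only.

```python
from datetime import datetime

YEAR = 2008  # assumption: taken from the Date header, not from the log lines


def parse_syslog_time(stamp, year=YEAR):
    """Parse a syslog-style 'Mon D HH:MM:SS' stamp into a datetime."""
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def seconds_between(start, end):
    """Elapsed seconds between two syslog stamps from the same year."""
    return (parse_syslog_time(end) - parse_syslog_time(start)).total_seconds()


link_up = "Nov 1 19:46:45"   # "Link is Up 10 Mbps Full Duplex"
hang = "Nov 1 19:48:40"      # next "Detected Tx Unit Hang"

print(seconds_between(link_up, hang))  # 115.0, i.e. just under 2 minutes
print(int("d7", 16))                   # 215, the TDT value in that dump
```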
We have a patch that can dump the hardware state of the transmit ring in
much more detail. Once we get more information, we may have to run
that to figure out what's up.
Thanks for the report,
Jesse