Message-ID: <20130215232752.GA19796@linuxace.com>
Date:	Fri, 15 Feb 2013 15:27:52 -0800
From:	Phil Oester <kernel@...uxace.com>
To:	netdev@...r.kernel.org
Cc:	eric.dumazet@...il.com
Subject: 3.7 networking regression - bisected

Since upgrading one box to 3.7, I've been seeing e1000e hangs on it:

	e1000e 0000:00:19.0: eth2: Detected Hardware Unit Hang

3.6 worked fine.  I bisected the problem down to commit
69b08f62e17439ee3d436faf0b9a7ca6fffb78db ("net: use bigger pages in
__netdev_alloc_frag").  Running 3.7.x without that commit is
currently working fine.

I have other boxes on 3.7 with e1000e that are not experiencing this
problem.  The difference on this box is that it runs at 100 Mb/s, which
disables TSO:

	e1000e 0000:00:19.0 eth2: 10/100 speed: disabling TSO

So maybe the problem doesn't occur with TSO enabled?  Either that,
or there is something unique about the traffic pattern this box is seeing.

Full log entries below.

Phil Oester



Feb 15 17:38:19 char1 kernel: e1000e 0000:00:19.0: eth2: Detected Hardware Unit Hang:
Feb 15 17:38:19 char1 kernel:  TDH                  <f>
Feb 15 17:38:19 char1 kernel:  TDT                  <22>
Feb 15 17:38:19 char1 kernel:  next_to_use          <22>
Feb 15 17:38:19 char1 kernel:  next_to_clean        <d>
Feb 15 17:38:19 char1 kernel: buffer_info[next_to_clean]:
Feb 15 17:38:19 char1 kernel:  time_stamp           <10001bb67>
Feb 15 17:38:19 char1 kernel:  next_to_watch        <f>
Feb 15 17:38:19 char1 kernel:  jiffies              <10001bfa9>
Feb 15 17:38:19 char1 kernel:  next_to_watch.status <0>
Feb 15 17:38:19 char1 kernel: MAC Status             <80243>
Feb 15 17:38:19 char1 kernel: PHY Status             <792d>
Feb 15 17:38:19 char1 kernel: PHY 1000BASE-T Status  <0>
Feb 15 17:38:19 char1 kernel: PHY Extended Status    <3000>
Feb 15 17:38:19 char1 kernel: PCI Status             <10>
Feb 15 17:38:21 char1 kernel: e1000e 0000:00:19.0: eth2: Detected Hardware Unit Hang:
Feb 15 17:38:21 char1 kernel:  TDH                  <f>
Feb 15 17:38:21 char1 kernel:  TDT                  <22>
Feb 15 17:38:21 char1 kernel:  next_to_use          <22>
Feb 15 17:38:21 char1 kernel:  next_to_clean        <d>
Feb 15 17:38:21 char1 kernel: buffer_info[next_to_clean]:
Feb 15 17:38:21 char1 kernel:  time_stamp           <10001bb67>
Feb 15 17:38:21 char1 kernel:  next_to_watch        <f>
Feb 15 17:38:21 char1 kernel:  jiffies              <10001c071>
Feb 15 17:38:21 char1 kernel:  next_to_watch.status <0>
Feb 15 17:38:21 char1 kernel: MAC Status             <80243>
Feb 15 17:38:21 char1 kernel: PHY Status             <792d>
Feb 15 17:38:21 char1 kernel: PHY 1000BASE-T Status  <0>
Feb 15 17:38:21 char1 kernel: PHY Extended Status    <3000>
Feb 15 17:38:21 char1 kernel: PCI Status             <10>
Feb 15 17:38:23 char1 kernel: e1000e 0000:00:19.0: eth2: Detected Hardware Unit Hang:
Feb 15 17:38:23 char1 kernel:  TDH                  <f>
Feb 15 17:38:23 char1 kernel:  TDT                  <22>
Feb 15 17:38:23 char1 kernel:  next_to_use          <22>
Feb 15 17:38:23 char1 kernel:  next_to_clean        <d>
Feb 15 17:38:23 char1 kernel: buffer_info[next_to_clean]:
Feb 15 17:38:23 char1 kernel:  time_stamp           <10001bb67>
Feb 15 17:38:23 char1 kernel:  next_to_watch        <f>
Feb 15 17:38:23 char1 kernel:  jiffies              <10001c139>
Feb 15 17:38:23 char1 kernel:  next_to_watch.status <0>
Feb 15 17:38:23 char1 kernel: MAC Status             <80243>
Feb 15 17:38:23 char1 kernel: PHY Status             <792d>
Feb 15 17:38:23 char1 kernel: PHY 1000BASE-T Status  <0>
Feb 15 17:38:23 char1 kernel: PHY Extended Status    <3000>
Feb 15 17:38:23 char1 kernel: PCI Status             <10>
Feb 15 17:38:25 char1 kernel: e1000e 0000:00:19.0: eth2: Detected Hardware Unit Hang:
Feb 15 17:38:25 char1 kernel:  TDH                  <f>
Feb 15 17:38:25 char1 kernel:  TDT                  <22>
Feb 15 17:38:25 char1 kernel:  next_to_use          <22>
Feb 15 17:38:25 char1 kernel:  next_to_clean        <d>
Feb 15 17:38:25 char1 kernel: buffer_info[next_to_clean]:
Feb 15 17:38:25 char1 kernel:  time_stamp           <10001bb67>
Feb 15 17:38:25 char1 kernel:  next_to_watch        <f>
Feb 15 17:38:25 char1 kernel:  jiffies              <10001c201>
Feb 15 17:38:25 char1 kernel:  next_to_watch.status <0>
Feb 15 17:38:25 char1 kernel: MAC Status             <80243>
Feb 15 17:38:25 char1 kernel: PHY Status             <792d>
Feb 15 17:38:25 char1 kernel: PHY 1000BASE-T Status  <0>
Feb 15 17:38:25 char1 kernel: PHY Extended Status    <3000>
Feb 15 17:38:25 char1 kernel: PCI Status             <10>
Feb 15 17:38:27 char1 kernel: e1000e 0000:00:19.0: eth2: Detected Hardware Unit Hang:
Feb 15 17:38:27 char1 kernel:  TDH                  <f>
Feb 15 17:38:27 char1 kernel:  TDT                  <22>
Feb 15 17:38:27 char1 kernel:  next_to_use          <22>
Feb 15 17:38:27 char1 kernel:  next_to_clean        <d>
Feb 15 17:38:27 char1 kernel: buffer_info[next_to_clean]:
Feb 15 17:38:27 char1 kernel:  time_stamp           <10001bb67>
Feb 15 17:38:27 char1 kernel:  next_to_watch        <f>
Feb 15 17:38:27 char1 kernel:  jiffies              <10001c2c9>
Feb 15 17:38:27 char1 kernel:  next_to_watch.status <0>
Feb 15 17:38:27 char1 kernel: MAC Status             <80243>
Feb 15 17:38:27 char1 kernel: PHY Status             <792d>
Feb 15 17:38:27 char1 kernel: PHY 1000BASE-T Status  <0>
Feb 15 17:38:27 char1 kernel: PHY Extended Status    <3000>
Feb 15 17:38:27 char1 kernel: PCI Status             <10>
Feb 15 17:38:28 char1 kernel: ------------[ cut here ]------------
Feb 15 17:38:28 char1 kernel: WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0x239/0x250()
Feb 15 17:38:28 char1 kernel: Hardware name: OptiPlex 755                 
Feb 15 17:38:28 char1 kernel: NETDEV WATCHDOG: eth2 (e1000e): transmit queue 0 timed out
Feb 15 17:38:28 char1 kernel: Modules linked in: iptable_nat nf_nat_ipv4 xt_LOG xt_limit xt_pkttype xt_tcpudp xt_state xt_iprange xt_multiport iptable_filter ip_tables x_tables nf_nat_tftp nf_nat_ftp nf_nat nf_conntrack_tftp nf_conntrack_ftp nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack e1000e e1000 lpc_ich coretemp hwmon mfd_core ahci libahci
Feb 15 17:38:28 char1 kernel: Pid: 0, comm: swapper/1 Not tainted 3.6.0-2.fc16.x86_64 #1
Feb 15 17:38:28 char1 kernel: Call Trace:
Feb 15 17:38:28 char1 kernel: <IRQ>  [<ffffffff8128c300>] ? dev_watchdog+0x170/0x250
Feb 15 17:38:28 char1 kernel: [<ffffffff8102ac6b>] ? warn_slowpath_common+0x7b/0xc0
Feb 15 17:38:28 char1 kernel: [<ffffffff8102ad65>] ? warn_slowpath_fmt+0x45/0x50
Feb 15 17:38:28 char1 kernel: [<ffffffff8128c3c9>] ? dev_watchdog+0x239/0x250
Feb 15 17:38:28 char1 kernel: [<ffffffff81036bb6>] ? run_timer_softirq+0x106/0x230
Feb 15 17:38:28 char1 kernel: [<ffffffff8128c190>] ? pfifo_fast_dequeue+0xe0/0xe0
Feb 15 17:38:28 char1 kernel: [<ffffffff81031af8>] ? __do_softirq+0xa8/0x150
Feb 15 17:38:28 char1 kernel: [<ffffffff812fd3ec>] ? call_softirq+0x1c/0x26
Feb 15 17:38:28 char1 kernel: [<ffffffff81003a1d>] ? do_softirq+0x4d/0x80
Feb 15 17:38:28 char1 kernel: [<ffffffff81031e0e>] ? irq_exit+0x8e/0xb0
Feb 15 17:38:28 char1 kernel: [<ffffffff8101c628>] ? smp_apic_timer_interrupt+0x68/0xa0
Feb 15 17:38:28 char1 kernel: [<ffffffff812fcd07>] ? apic_timer_interrupt+0x67/0x70
Feb 15 17:38:28 char1 kernel: <EOI>  [<ffffffff81061690>] ? __tick_nohz_idle_enter+0x330/0x410
Feb 15 17:38:28 char1 kernel: [<ffffffff810098a1>] ? mwait_idle+0x51/0x70
Feb 15 17:38:28 char1 kernel: [<ffffffff8100a1b6>] ? cpu_idle+0x86/0xd0
Feb 15 17:38:28 char1 kernel: ---[ end trace 50a96207e385a66c ]---
Feb 15 17:38:28 char1 kernel: e1000e 0000:00:19.0: eth2: Reset adapter
Feb 15 17:38:29 char1 kernel: e1000e: eth2 NIC Link is Up 100 Mbps Full Duplex, Flow Control: None
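
The hang dumps above can be summarized mechanically. The following is a
minimal sketch, not part of the driver or any existing tool: the names
parse_hang_dumps, summarize and SAMPLE are hypothetical, and it assumes
that every dumped value is hexadecimal and that the ring indices have not
wrapped around.

```python
import re

# Matches dump fields such as "kernel:  TDH                  <f>".
# Assumption: every value between the angle brackets is hexadecimal.
FIELD = re.compile(
    r"kernel:\s+(?P<name>[A-Za-z_][\w.\[\] -]*?)\s+<(?P<value>[0-9a-f]+)>\s*$"
)

# A few lines copied from the first dump in the log above.
SAMPLE = """\
Feb 15 17:38:19 char1 kernel: e1000e 0000:00:19.0: eth2: Detected Hardware Unit Hang:
Feb 15 17:38:19 char1 kernel:  TDH                  <f>
Feb 15 17:38:19 char1 kernel:  TDT                  <22>
Feb 15 17:38:19 char1 kernel:  next_to_use          <22>
Feb 15 17:38:19 char1 kernel:  next_to_clean        <d>
Feb 15 17:38:19 char1 kernel: buffer_info[next_to_clean]:
Feb 15 17:38:19 char1 kernel:  time_stamp           <10001bb67>
Feb 15 17:38:19 char1 kernel:  jiffies              <10001bfa9>
"""


def parse_hang_dumps(log_text):
    """Split a log into one {field name: int value} dict per hang report."""
    dumps = []
    for line in log_text.splitlines():
        if "Detected Hardware Unit Hang" in line:
            dumps.append({})  # start a new report
            continue
        match = FIELD.search(line)
        if match and dumps:
            dumps[-1][match.group("name")] = int(match.group("value"), 16)
    return dumps


def summarize(dump):
    """Derive stall time and queue depth, assuming no ring wraparound."""
    return {
        # Jiffies elapsed since the stuck buffer was queued.
        "stalled_jiffies": dump["jiffies"] - dump["time_stamp"],
        # Descriptors between the hardware head and tail pointers.
        "hw_pending": dump["TDT"] - dump["TDH"],
        # Descriptors queued by the driver but not yet cleaned.
        "sw_pending": dump["next_to_use"] - dump["next_to_clean"],
    }


if __name__ == "__main__":
    for dump in parse_hang_dumps(SAMPLE):
        print(summarize(dump))
```

For the first dump this gives a stall of 0x442 (1090) jiffies, with 19
descriptors between TDH and TDT. Across the five dumps only jiffies
changes, by 0xc8 (200) each time, while TDH, TDT and time_stamp stay
fixed, which is consistent with the transmit ring making no progress at
all until the watchdog resets the adapter.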