Message-ID: <36D9DB17C6DE9E40B059440DB8D95F52032A44D8@orsmsx418.amr.corp.intel.com>
Date:	Mon, 20 Aug 2007 09:21:54 -0700
From:	"Brandeburg, Jesse" <jesse.brandeburg@...el.com>
To:	"Alan J. Wylie" <alan@...ie.me.uk>
Cc:	<e1000-devel@...ts.sourceforge.net>,
	"Linux Network Development list" <netdev@...r.kernel.org>
Subject: RE: skb_pull_rcsum - Fatal exception in interrupt

Alan J. Wylie wrote:
> We have been shipping Linux based servers to customers for several
> years now, with few problems. Recently, however, a single customer has
> been seeing kernel panics. Unfortunately, the customer is about 200
> miles away, so physical access is limited. There are two ethernet
> interfaces, one should be plugged into a local RFC1918 network, the
> other is connected to the internet. If eth0 is plugged into the local
> network, a short time later the system panics.
> 
> Hardware: Intel S5000VSA server
> 
> Network cards: Intel e1000
>    Intel Corporation 80003ES2LAN Gigabit Ethernet Controller (Copper)

Hi Alan, I work on the team that supports e1000. I'd be interested in
seeing the dmesg output from the machine before it crashes; maybe you
can add that to your web collection of data below?

Many of the 5000 series machines have BMCs, so it's possible that you
could set up remote management and reboot the machine remotely, but that
may not be worth the extra effort.  It could, however, give you a serial
console over ethernet, which would get us the full panic message, but
see below.

> # CONFIG_E1000_DISABLE_PACKET_SPLIT is not set
Can you try setting CONFIG_E1000_DISABLE_PACKET_SPLIT=y?

This will prevent the driver from splitting the header from the packet
data, which could be exacerbating this problem.
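
To confirm which way the option ended up in a given build, the kernel's
.config can be scanned for it.  The following is only a minimal sketch:
the function name kconfig_value is made up for illustration and is not
part of any kernel tooling, and it assumes the usual .config line forms
("OPTION=value" and "# OPTION is not set").

```python
def kconfig_value(config_text, option):
    """Return the value of a kernel config option, or None.

    None means the option is either absent or marked
    '# OPTION is not set' in the given .config text.
    (Hypothetical helper, for illustration only.)
    """
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        # Disabled options appear as a specially formatted comment.
        if line == f"# {option} is not set":
            return None
        # Enabled options appear as OPTION=value (y, m, or a literal).
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return None


# Example with the line quoted from the original report.
reported = "# CONFIG_E1000_DISABLE_PACKET_SPLIT is not set\n"
print(kconfig_value(reported, "CONFIG_E1000_DISABLE_PACKET_SPLIT"))
```

With the line quoted above this reports that the option is not set;
after rebuilding with the option enabled it should return "y".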

It's not immediately obvious whether this is a kernel or driver problem.
I hope you don't mind that I cc'd e1000-devel, since this is possibly
relevant to other e1000 users and developers.
 
> We shipped a second system, and this displayed identical symptoms.  We
> have tested with several recent 2.6 kernels, including
> 
> 2.6.22
> 2.6.17.14
> 2.6.20.15
> 
> all of which crash.
> 
> We have a couple of photographs showing the tail end of the messages
> on the screen.
> 
> The last two lines are:
> 
> EIP: [<c02b6fb2>] skb_pull_rcsum+0x6d/0x71 SS:ESP 09068:c03e1ea4
> Kernel panic - not syncing: Fatal exception in interrupt

Can you boot with vga=0x318 appended to the kernel options? This might
help you get more on the screen.  You could also look into netconsole,
but because this is a networking crash I don't know whether you'll get
data out of netconsole, and I don't know whether you can use netconsole
over the 'net', as I've only used it for local logging.
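
If netconsole is tried, something has to be listening on the receiving
machine, since netconsole sends kernel log messages as UDP datagrams.
Below is a rough sketch of such a listener, not a tested recipe for this
setup: the function names are invented for illustration, and port 6666
is assumed here as the target port, so it must match whatever the
netconsole parameters on the crashing machine actually specify.

```python
import socket


def open_listener(host="0.0.0.0", port=6666):
    """Bind a UDP socket for incoming netconsole datagrams.

    Port 6666 is an assumption; use the target port configured
    on the sending machine.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_lines(sock, count, timeout=5.0):
    """Collect up to `count` datagrams, stopping early on timeout."""
    sock.settimeout(timeout)
    lines = []
    for _ in range(count):
        try:
            data, _sender = sock.recvfrom(65535)
        except socket.timeout:
            break
        # Kernel messages are text; replace any undecodable bytes.
        lines.append(data.decode("utf-8", errors="replace").rstrip("\n"))
    return lines


# Usage on the logging host (blocks until messages arrive or time out):
#   listener = open_listener()
#   for line in receive_lines(listener, count=1000, timeout=600.0):
#       print(line)
```

Whether any datagrams make it out during a crash in the network receive
path is exactly the open question above, so an empty capture would not
by itself say much.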

Jesse