Date:	Mon, 28 Nov 2011 08:50:30 -0800
From:	Stephen Hemminger <shemminger@...tta.com>
To:	Vincent Blut <vincent.debian@...e.fr>
Cc:	netdev@...r.kernel.org,
	Debian Bug Tracking System <609994@...s.debian.org>
Subject: Re: sky2: hw csum failure

On Mon, 28 Nov 2011 12:10:20 +0000
Vincent Blut <vincent.debian@...e.fr> wrote:

> Hi,
> 
> [reference: http://bugs.debian.org/609994]
> 
> I have a Marvell ethernet controller which presents some failures when
> 'rx checksumming' is enabled,
> here is the model:
> 
> $ lspci -vvs 03:00.0
> 03:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 PCI-E
> Gigabit Ethernet Controller (rev 15)
>         Subsystem: Micro-Star International Co., Ltd. Marvell 88E8053
> Gigabit Ethernet Controller (MSI)
>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr- Stepping- SERR- FastB2B- DisINTx+
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR- INTx-
>         Latency: 0, Cache Line Size: 32 bytes
>         Interrupt: pin A routed to IRQ 44
>         Region 0: Memory at fdbfc000 (64-bit, non-prefetchable) [size=16K]
>         Region 2: I/O ports at 7c00 [size=256]
>         [virtual] Expansion ROM at fda00000 [disabled] [size=128K]
>         Capabilities: <access denied>
>         Kernel driver in use: sky2
> 
> At first I thought it was due to the MTU size, so I tested different
> values but unfortunately without positive effect.
> Overall this issue appears randomly when the incoming traffic is high. I
> tested 2.6.32, 3.1.1, and 3.2-rc3, sadly
> all are affected. Finally, the only way to avoid those failures is to
> disable 'rx checksumming' (ethtool -K ethX rx off).
> 
> Here is the stack trace:
> 
> [   14.615648] sky2 0000:03:00.0: eth1: enabling interface
> [   14.616452] ADDRCONF(NETDEV_UP): eth1: link is not ready
> [   17.094194] sky2 0000:03:00.0: eth1: Link is up at 1000 Mbps, full
> duplex, flow control both
> [   17.094887] ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
> [   28.080018] eth1: no IPv6 routers present
> [  563.816032] sky2 0000:03:00.0: eth1: hung mac 124:22 fifo 195 (150:145)
> [  563.816036] sky2 0000:03:00.0: eth1: receiver hang detected
> [  567.005422] sky2 0000:03:00.0: eth1: Link is up at 1000 Mbps, full
> duplex, flow control both
> [ 1040.816314] sky2 0000:03:00.0: eth1: rx error, status 0x7ffc0001
> length 1004
> [ 2097.401616] sky2 0000:03:00.0: eth1: rx error, status 0x39a339a3 length 0

This isn't really a hardware checksum failure.
Your problem is deeper than that. The internal parts of the chip are not
communicating correctly. The "hung mac" condition only occurs if the PCI
is really stuck. There may be a timing issue on your motherboard, or the BIOS
isn't setting up the device properly. The timing then gets messed up between
the end of frame status and the PCI shared memory region. Turning checksumming
off masks the problem, but the status is probably still corrupt.

In either case the problem is beyond the ability of the driver to fix or
work around. Your best bet is to see if there is a BIOS update, or replace
the hardware.
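
To check whether the hang messages keep showing up after a BIOS update or
with checksumming turned off, something like the following rough sketch
could be run over saved kernel log text. The patterns are taken only from
the log lines quoted above; the function name summarize and the layout of
the result are made up for illustration and are not part of the sky2 driver
or any existing tool.

```python
import re

# Patterns modelled on the sky2 messages quoted in this thread (assumption:
# other kernel versions may word these messages differently).
HANG = re.compile(r"sky2 \S+ (\w+): (?:hung mac|receiver hang detected)")
RX_ERROR = re.compile(r"sky2 \S+ (\w+): rx error, status (0x[0-9a-fA-F]+)")


def summarize(log_text):
    """Count hang messages and collect rx error status words per interface."""
    summary = {}
    for line in log_text.splitlines():
        hang = HANG.search(line)
        rx = RX_ERROR.search(line)
        if hang:
            entry = summary.setdefault(hang.group(1), {"hangs": 0, "rx_errors": []})
            entry["hangs"] += 1
        elif rx:
            entry = summary.setdefault(rx.group(1), {"hangs": 0, "rx_errors": []})
            entry["rx_errors"].append(rx.group(2))
    return summary


if __name__ == "__main__":
    import sys

    # Example use: dmesg | python3 this_script.py
    for iface, info in summarize(sys.stdin.read()).items():
        print(iface, "hangs:", info["hangs"], "rx errors:", info["rx_errors"])
```

If hang lines still appear with rx checksumming off, that would be
consistent with the checksum setting only masking the underlying problem.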
