Open Source and information security mailing list archives
 
Message-ID: <Pine.LNX.4.64.1111071109410.18030@hs20-bc2-1.build.redhat.com>
Date:	Mon, 7 Nov 2011 11:42:11 -0500 (EST)
From:	Mikulas Patocka <mpatocka@...hat.com>
To:	Stephen Hemminger <shemminger@...ux-foundation.org>
cc:	netdev@...r.kernel.org
Subject: data corruption in skge hardware

Hi

I found data corruption with an skge network card.

The card is this: "03:06.0 Ethernet controller: 3Com Corporation 3c940 
10/100/1000Base-T [Marvell] (rev 10)"

The machine has two quad-core Opterons with an HT2000 north bridge and an 
HT1000 south bridge.

When "scatter-gather" and "generic-segmentation-offload" are enabled, the 
card sends out corrupted packets.

It normally manifests as an ssh connection drop once every few days, but I 
found a workload that triggers this bug quickly.

I ran tcpdump on both the sending and the receiving machine and caught the 
packet corruption:

correct packet (on the sending machine):
19:03:21.131836 IP hydra.ssh > phoebe.58913: Flags [P.], seq 53712:53808, 
ack 1, win 193, options [nop,nop,TS val 8677173 ecr 1211608], length 96
        0x0000:  4510 0094 c7bf 4000 4006 f12d c0a8 8007
        0x0010:  c0a8 800e 0016 e621 2d64 84e6 1fc2 3f5b
        0x0020:  8018 00c1 81ed 0000 0101 080a 0084 6735
        0x0030:  0012 7cd8 4301 4af9 87c9 d2b4 8ba6 aedb
        0x0040:  0572 1738 93db 789c 634b 4386 d013 db27
        0x0050:  258b 6fa6 743c d429 a5e1 162f 2721 19bf
        0x0060:  6669 a5c3 6bea 89ec a635 b8b4 8727 38c1
        0x0070:  139f 5989 781b 49dd 79f5 4dfe 78ac ecb0
        0x0080:  546c 33e0 0953 04bc 0647 a9d4 2fc4 cba0
        0x0090:  44b2 3b01

incorrect packet (on the receiving machine):
19:03:21.133174 IP hydra.ssh > phoebe.58913: Flags [P.], seq 53712:53808, 
ack 1, win 193, options [nop,nop,TS val 8677173 ecr 1211608], length 96
        0x0000:  4510 0094 c7bf 4000 4006 f12d c0a8 8007
        0x0010:  c0a8 800e 0016 e621 2d64 84e6 1fc2 3f5b
        0x0020:  8018 00c1 6aa4 0000 0101 080a 0084 6735
        0x0030:  0012 7cd8 0000 0000 0000 0000 0010 0000
        0x0040:  0000 0000 0000 0000 0000 0000 0000 0000
        0x0050:  0000 0000 0000 0000 0000 00c0 dc92 4702
        0x0060:  88ff ff00 0000 0000 0000 0000 0000 0000
        0x0070:  0000 0000 0000 0000 0000 0000 0000 0000
        0x0080:  0000 0000 0000 0000 0000 0000 0000 0000
        0x0090:  0000 00e0

Evidently scatter-gather doesn't work: the header is correct, but the 
packet body was likely read from random memory.
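
One way to cross-check this from the two captures alone is to compare them 
byte by byte and recompute the TCP checksum. The sketch below is only an 
illustration, not part of the original report: the helper names 
(parse_dump, fold_sum, tcp_sums) are made up for this example, and it 
assumes both dumps hold the complete 148-byte IPv4 packet starting at the 
IP header. The interpretation is also an assumption: if the checksum 
captured on the sender equals the bare pseudo-header sum, transmit checksum 
offload was probably in use, and if the checksum seen by the receiver is 
valid for the corrupted payload, the card most likely checksummed the same 
wrong data that it put on the wire.

```python
import struct

def parse_dump(text):
    """Convert tcpdump -x style lines ('0x0000:  4510 ...') to bytes."""
    out = bytearray()
    for line in text.strip().splitlines():
        _, _, words = line.partition(":")
        out += bytes.fromhex(words.replace(" ", ""))
    return bytes(out)

SENT = parse_dump("""
0x0000:  4510 0094 c7bf 4000 4006 f12d c0a8 8007
0x0010:  c0a8 800e 0016 e621 2d64 84e6 1fc2 3f5b
0x0020:  8018 00c1 81ed 0000 0101 080a 0084 6735
0x0030:  0012 7cd8 4301 4af9 87c9 d2b4 8ba6 aedb
0x0040:  0572 1738 93db 789c 634b 4386 d013 db27
0x0050:  258b 6fa6 743c d429 a5e1 162f 2721 19bf
0x0060:  6669 a5c3 6bea 89ec a635 b8b4 8727 38c1
0x0070:  139f 5989 781b 49dd 79f5 4dfe 78ac ecb0
0x0080:  546c 33e0 0953 04bc 0647 a9d4 2fc4 cba0
0x0090:  44b2 3b01
""")

RECV = parse_dump("""
0x0000:  4510 0094 c7bf 4000 4006 f12d c0a8 8007
0x0010:  c0a8 800e 0016 e621 2d64 84e6 1fc2 3f5b
0x0020:  8018 00c1 6aa4 0000 0101 080a 0084 6735
0x0030:  0012 7cd8 0000 0000 0000 0000 0010 0000
0x0040:  0000 0000 0000 0000 0000 0000 0000 0000
0x0050:  0000 0000 0000 0000 0000 00c0 dc92 4702
0x0060:  88ff ff00 0000 0000 0000 0000 0000 0000
0x0070:  0000 0000 0000 0000 0000 0000 0000 0000
0x0080:  0000 0000 0000 0000 0000 0000 0000 0000
0x0090:  0000 00e0
""")

def fold_sum(data):
    """16-bit one's-complement sum of data (not inverted)."""
    if len(data) % 2:
        data += b"\0"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total

def tcp_sums(pkt):
    """Return (stored checksum, pseudo-header sum, correct checksum)."""
    ihl = (pkt[0] & 0x0F) * 4                  # IPv4 header length in bytes
    tcp = pkt[ihl:]
    # Pseudo-header: source, destination, zero, protocol, TCP length
    pseudo = pkt[12:20] + struct.pack("!BBH", 0, pkt[9], len(tcp))
    stored = struct.unpack("!H", tcp[16:18])[0]
    zeroed = tcp[:16] + b"\0\0" + tcp[18:]     # checksum field cleared
    return stored, fold_sum(pseudo), 0xFFFF ^ fold_sum(pseudo + zeroed)

# Offsets at which the two captures disagree
diff = [i for i, (a, b) in enumerate(zip(SENT, RECV)) if a != b]
payload_start = (SENT[0] & 0x0F) * 4 + (SENT[32] >> 4) * 4

print("first differing offsets:", [hex(i) for i in diff[:3]])
print("payload starts at:", hex(payload_start))
for name, pkt in (("sent", SENT), ("recv", RECV)):
    stored, partial, full = tcp_sums(pkt)
    print(name, "stored=%04x pseudo-only=%04x correct=%04x"
          % (stored, partial, full))
```

Run against the dumps above, the only differences outside the payload are 
the two TCP checksum bytes at offset 0x24; everything else differs from 
offset 0x34 on, which is where the TCP payload begins. The checksum in the 
sender's capture (81ed) matches the pseudo-header-only sum, and the one in 
the receiver's capture (6aa4) is correct for the mostly-zero payload that 
arrived.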

I tried using the "clflush" instruction on the transmit descriptor and the 
packet body to test whether it is a cache-coherency issue, but the 
corruption was still there.

I tried limiting memory to 2G to test whether it was a problem with high 
memory, but the corruption was still there.

I tried older kernels (as far back as 2.6.34); the corruption was still 
there, but it took much longer to trigger with the old kernels.


Do you have other reports of data corruption with skge hardware? Shouldn't 
the driver set "scatter-gather" off by default because it is unreliable?

Mikulas
