Date:	Tue, 13 Nov 2007 12:51:33 -0500
From:	Tony Battersby <tonyb@...ernetics.com>
To:	shemminger@...ux-foundation.org, netdev@...r.kernel.org
Subject: BUG: sky2: hw csum failure with dual-port copper NIC on SMP

I am getting "hw csum failure" messages with sky2.  I have seen this
problem reported elsewhere with a fibre NIC, but I am using a copper
NIC.  It seems to be triggered by SMP.  It is easy to reproduce in
2.6.23.  2.6.24-rc2-git3 still has the problem, but it happens less
frequently.

To reproduce the problem, I am using a simple network benchmark program
that I wrote.  It calls send()/recv() as fast as possible on a memory
buffer (null data, no disk I/O, no data integrity checking).
The computer with the SysKonnect NIC acts as the server.  I have two
other computers with Intel PRO/1000 NICs that are directly cabled to the
two ports on the SysKonnect NIC.  Each of them runs the client program,
which connects to the server, send()s 10 GB, and then recv()s 10 GB.
Essentially, both ports on the SysKonnect NIC are receiving at the
maximum rate for a few minutes, and then transmitting at the maximum
rate for a few minutes.  Sustained throughput is about 117 MB/s on both
ports simultaneously.
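
For readers who want to try something similar, the test described above can
be approximated with the sketch below.  It is not the original benchmark
program; the function names, the 64 KiB buffer size, and the choice of
Python are assumptions made for illustration.  The server side receives a
fixed number of bytes and then sends the same number back, and the client
does the reverse, with no inspection of the data.

```python
import socket
import threading

CHUNK = 64 * 1024  # reusable buffer size; an assumed value, not from the report


def send_total(sock, total):
    """send() zero-filled data until `total` bytes have gone out."""
    view = memoryview(bytes(CHUNK))  # null data, never modified
    sent = 0
    while sent < total:
        # send() may accept fewer bytes than offered, so count what it took.
        sent += sock.send(view[: min(CHUNK, total - sent)])
    return sent


def recv_total(sock, total):
    """recv() and discard data until `total` bytes have arrived."""
    buf = bytearray(CHUNK)
    got = 0
    while got < total:
        n = sock.recv_into(buf, min(CHUNK, total - got))
        if n == 0:
            raise ConnectionError("peer closed the connection early")
        got += n
    return got


def serve_one(listener, total):
    """Server side: accept one client, receive `total` bytes, send `total` back."""
    conn, _addr = listener.accept()
    with conn:
        recv_total(conn, total)
        send_total(conn, total)


def run_client(host, port, total):
    """Client side: send `total` bytes, then receive `total` bytes."""
    with socket.create_connection((host, port)) as sock:
        sent = send_total(sock, total)
        got = recv_total(sock, total)
    return sent, got
```

To mirror the setup in this report, one client per port would call
run_client() with total set to 10 GB while the server runs serve_one() for
each connection, for example in separate threads or processes.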

The "hw csum failure" does not seem to affect the test.  send()/recv()
continue to work normally.  Nothing locks up.
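
One plausible reading of why the test is unaffected: the message appears to
be logged when the checksum supplied by the NIC does not verify but the
checksum that the kernel then recomputes in software does, so the data is
still delivered.  The sketch below illustrates that idea only.  It uses the
RFC 1071 ones'-complement sum, leaves out the TCP pseudo-header, and the
classify() helper is a hypothetical model, not kernel code.

```python
def internet_checksum(data: bytes) -> int:
    """RFC 1071 ones'-complement checksum over `data` (16-bit big-endian words)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        # Fold the carries back into the low 16 bits.
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def classify(segment: bytes, hw_verified: bool) -> str:
    """Hypothetical model of the receive decision.

    `segment` includes its own checksum field, so a good segment sums to 0.
    `hw_verified` says whether the NIC-supplied checksum verified.
    """
    if hw_verified:
        return "deliver"
    if internet_checksum(segment) == 0:
        # Software agrees the data is fine, so only the hardware value was bad.
        return "deliver, log hw csum failure"
    return "drop"
```

Under this model a bad hardware checksum costs an extra software pass over
the packet but does not lose data, which would match send()/recv()
continuing to work normally.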

I get several "hw csum failure" messages per minute on 2.6.23-SMP.  The
error does not happen with 2.6.23 if I boot with "maxcpus=1".  The
message seems less frequent with 2.6.24-SMP, but it still happens once
every minute or so.

The "hw csum failure" message does not happen when only one port is in
use.  You have to stress both ports simultaneously to reproduce the
problem.

A separate, purely cosmetic issue is that "ifconfig" shows eth2 at IRQ 16
and eth3 at IRQ 218, when in fact both are at IRQ 218.  IRQ 16 is the
regular interrupt line and IRQ 218 is the MSI interrupt.  I imagine that
the driver is just reporting the IRQ incorrectly for eth2.  It doesn't
break anything.
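
The IRQ that ifconfig prints can be cross-checked against /proc/interrupts.
The sketch below is a hypothetical helper, not part of any existing tool.
It assumes the usual layout of a header row of CPU names followed by one
row per IRQ, and it maps each device name to the numbered IRQs it is
registered on.

```python
def irqs_by_device(text: str) -> dict:
    """Map device names in /proc/interrupts-style text to their IRQ numbers."""
    lines = text.splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    result = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        label = label.strip()
        # Only numbered IRQs have a device list; skip NMI, LOC, ERR, etc.
        if not sep or not label.isdigit():
            continue
        fields = rest.split()
        # Layout: one count per CPU, the controller type, then the devices.
        if len(fields) < ncpus + 2:
            continue
        devices = " ".join(fields[ncpus + 1:])
        for dev in devices.split(","):
            result.setdefault(dev.strip(), []).append(label)
    return result
```

Run against the /proc/interrupts listing further down, this kind of check
shows only uhci_hcd:usb5 on IRQ 16 and the sky2 interrupt registered under
the name eth2 on IRQ 218, with no separate entry for eth3.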

Let me know if I can be of any further assistance in tracking down this
problem.

NIC: SysKonnect SK-9E22 dual-port copper PCI-express
motherboard: SuperMicro PDSME
CPU: Pentium D 945 (dual-core 3.4 GHz)
kernel versions: 2.6.23 and 2.6.24-rc2-git3

All information below is from 2.6.24-rc2-git3.

portion of dmesg showing error:
<unknown>: hw csum failure.
 [<c02c0910>] skb_copy_and_csum_datagram_iovec+0x120/0x130
 [<c0180913>] __set_page_dirty+0x83/0x140
 [<c02ef2c1>] tcp_rcv_established+0x981/0x9a0
 [<c02f6490>] tcp_v4_do_rcv+0xc0/0x370
 [<c02ba042>] release_sock+0x12/0xa0
 [<c02bb0f1>] sk_wait_data+0xa1/0xd0
 [<c02e3ef8>] tcp_prequeue_process+0x48/0x70
 [<c02e4ea1>] tcp_recvmsg+0x671/0xc50
 [<c0117bc3>] enqueue_task_fair+0x73/0xb0
 [<c02ba305>] sock_common_recvmsg+0x45/0x70
 [<c02b98d8>] sock_recvmsg+0xd8/0x130
 [<c012eef0>] autoremove_wake_function+0x0/0x50
 [<c0120d62>] __do_softirq+0x82/0x100
 [<c0120f12>] irq_exit+0x52/0x90
 [<c010f6b4>] smp_apic_timer_interrupt+0x54/0x80
 [<c02b9c6b>] sys_recvfrom+0xeb/0x180
 [<c0111cea>] read_hpet+0xa/0x10
 [<c01347f0>] getnstimeofday+0x40/0xf0
 [<c0118c20>] rebalance_domains+0x110/0x3e0
 [<c02b9d33>] sys_recv+0x33/0x40
 [<c02b9ea5>] sys_socketcall+0x165/0x280
 [<c0102a4e>] sysenter_past_esp+0x5f/0x85
 =======================

dmesg | grep sky2
sky2 0000:04:00.0: v1.20 addr 0xea300000 irq 16 Yukon-XL (0xb3) rev 1
sky2 0000:04:00.0: PCI Express Advanced Error Reporting not configured or MMCONFIG problem?
sky2 eth2: addr 00:00:5a:72:b8:91
sky2 eth3: addr 00:00:5a:72:b8:92
sky2 eth2: enabling interface
sky2 eth3: enabling interface
sky2 eth2: Link is up at 1000 Mbps, full duplex, flow control both
sky2 eth3: Link is up at 1000 Mbps, full duplex, flow control both

ifconfig
eth2      Link encap:Ethernet  HWaddr 00:00:5A:72:B8:91  
          inet addr:192.168.1.10  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:34910877 errors:0 dropped:0 overruns:0 frame:0
          TX packets:22659597 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:3207874526 (2.9 GiB)  TX bytes:2888042042 (2.6 GiB)
          Interrupt:16 

eth3      Link encap:Ethernet  HWaddr 00:00:5A:72:B8:92  
          inet addr:137.157.10.224  Bcast:137.157.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:34902414 errors:0 dropped:0 overruns:0 frame:0
          TX packets:22641940 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:3207442696 (2.9 GiB)  TX bytes:2886952355 (2.6 GiB)
          Interrupt:218 

ethtool -i eth2
driver: sky2
version: 1.20
firmware-version: N/A
bus-info: 0000:04:00.0

ethtool eth2
Settings for eth2:
	Supported ports: [ TP ]
	Supported link modes:   10baseT/Half 10baseT/Full 
	                        100baseT/Half 100baseT/Full 
	                        1000baseT/Half 1000baseT/Full 
	Supports auto-negotiation: Yes
	Advertised link modes:  10baseT/Half 10baseT/Full 
	                        100baseT/Half 100baseT/Full 
	                        1000baseT/Half 1000baseT/Full 
	Advertised auto-negotiation: Yes
	Speed: 1000Mb/s
	Duplex: Full
	Port: Twisted Pair
	PHYAD: 0
	Transceiver: internal
	Auto-negotiation: on
	Supports Wake-on: pg
	Wake-on: d
	Current message level: 0x000000ff (255)
	Link detected: yes

ethtool -k eth2
Offload parameters for eth2:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on

ethtool -S eth2
NIC statistics:
     tx_bytes: 33946810766
     rx_bytes: 33901041384
     tx_broadcast: 0
     rx_broadcast: 1
     tx_multicast: 0
     rx_multicast: 0
     tx_unicast: 35564726
     rx_unicast: 34910876
     tx_mac_pause: 0
     rx_mac_pause: 0
     collisions: 0
     late_collision: 0
     aborted: 0
     single_collisions: 0
     multi_collisions: 0
     rx_short: 0
     rx_runt: 0
     rx_64_byte_packets: 13
     rx_65_to_127_byte_packets: 13166182
     rx_128_to_255_byte_packets: 5
     rx_256_to_511_byte_packets: 6049
     rx_512_to_1023_byte_packets: 23940
     rx_1024_to_1518_byte_packets: 21714688
     rx_1518_to_max_byte_packets: 0
     rx_too_long: 0
     rx_fifo_overflow: 0
     rx_jabber: 0
     rx_fcs_error: 0
     tx_64_byte_packets: 13
     tx_65_to_127_byte_packets: 10811129
     tx_128_to_255_byte_packets: 873915
     tx_256_to_511_byte_packets: 955169
     tx_512_to_1023_byte_packets: 2245568
     tx_1024_to_1518_byte_packets: 20678932
     tx_1519_to_max_byte_packets: 0
     tx_fifo_underrun: 0

ethtool -i eth3
driver: sky2
version: 1.20
firmware-version: N/A
bus-info: 0000:04:00.0

ethtool eth3
Settings for eth3:
	Supported ports: [ TP ]
	Supported link modes:   10baseT/Half 10baseT/Full 
	                        100baseT/Half 100baseT/Full 
	                        1000baseT/Half 1000baseT/Full 
	Supports auto-negotiation: Yes
	Advertised link modes:  10baseT/Half 10baseT/Full 
	                        100baseT/Half 100baseT/Full 
	                        1000baseT/Half 1000baseT/Full 
	Advertised auto-negotiation: Yes
	Speed: 1000Mb/s
	Duplex: Full
	Port: Twisted Pair
	PHYAD: 0
	Transceiver: internal
	Auto-negotiation: on
	Supports Wake-on: pg
	Wake-on: d
	Current message level: 0x000000ff (255)
	Link detected: yes

ethtool -k eth3
Offload parameters for eth3:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on

ethtool -S eth3
NIC statistics:
     tx_bytes: 33948750825
     rx_bytes: 33900457220
     tx_broadcast: 0
     rx_broadcast: 137
     tx_multicast: 0
     rx_multicast: 0
     tx_unicast: 35591358
     rx_unicast: 34902277
     tx_mac_pause: 0
     rx_mac_pause: 0
     collisions: 31
     late_collision: 0
     aborted: 0
     single_collisions: 29
     multi_collisions: 1
     rx_short: 0
     rx_runt: 0
     rx_64_byte_packets: 64
     rx_65_to_127_byte_packets: 13151060
     rx_128_to_255_byte_packets: 23
     rx_256_to_511_byte_packets: 7867
     rx_512_to_1023_byte_packets: 36713
     rx_1024_to_1518_byte_packets: 21706687
     rx_1518_to_max_byte_packets: 0
     rx_too_long: 0
     rx_fifo_overflow: 0
     rx_jabber: 0
     rx_fcs_error: 0
     tx_64_byte_packets: 21
     tx_65_to_127_byte_packets: 10750614
     tx_128_to_255_byte_packets: 945463
     tx_256_to_511_byte_packets: 1004551
     tx_512_to_1023_byte_packets: 2153163
     tx_1024_to_1518_byte_packets: 20737546
     tx_1519_to_max_byte_packets: 0
     tx_fifo_underrun: 0

cat /proc/interrupts
           CPU0       CPU1       
  0:         89          0   IO-APIC-edge      timer
  1:        207          0   IO-APIC-edge      i8042
  7:          0          0   IO-APIC-edge      parport0
  8:          3          0   IO-APIC-edge      rtc
  9:          0          0   IO-APIC-fasteoi   acpi
 12:          5          0   IO-APIC-edge      i8042
 14:        784          0   IO-APIC-edge      ide0
 16:          0          0   IO-APIC-fasteoi   uhci_hcd:usb5
 18:          0          0   IO-APIC-fasteoi   uhci_hcd:usb4
 19:          0          0   IO-APIC-fasteoi   uhci_hcd:usb3
 20:          0          0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb2
218:    4482759    4446537   PCI-MSI-edge      eth2
219:          0          0   PCI-MSI-edge      ahci
NMI:          0          0   Non-maskable interrupts
LOC:      65542      48825   Local timer interrupts
RES:        226         59   Rescheduling interrupts
CAL:         80         60   function call interrupts
TLB:         22         52   TLB shootdowns
TRM:          0          0   Thermal event interrupts
SPU:          0          0   Spurious interrupts
ERR:          0
MIS:          0

lspci -vv
04:00.0 0200: 1148:9e00 (rev 14)
04:00.0 Ethernet controller: SysKonnect SK-9Exx 10/100/1000Base-T Adapter (rev 14)
	Subsystem: SysKonnect SK-9E22 Server Adapter
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 0, Cache Line Size 08
	Interrupt: pin A routed to IRQ 218
	Region 0: Memory at ea300000 (64-bit, non-prefetchable) [size=16K]
	Region 2: I/O ports at 8000 [size=256]
	[virtual] Expansion ROM at ea320000 [disabled] [size=128K]
	Capabilities: [48] Power Management version 2
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
		Status: D0 PME-Enable- DSel=0 DScale=1 PME-
	Capabilities: [50] Vital Product Data
	Capabilities: [5c] Message Signalled Interrupts: 64bit+ Queue=0/1 Enable+
		Address: 00000000fee0200c  Data: 413a
	Capabilities: [e0] Express Legacy Endpoint IRQ 0
		Device: Supported: MaxPayload 128 bytes, PhantFunc 0, ExtTag-
		Device: Latency L0s unlimited, L1 unlimited
		Device: AtnBtn- AtnInd- PwrInd-
		Device: Errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
		Device: RlxdOrd- ExtTag- PhantFunc- AuxPwr+ NoSnoop-
		Device: MaxPayload 128 bytes, MaxReadReq 512 bytes
		Link: Supported Speed 2.5Gb/s, Width x4, ASPM L0s, Port 0
		Link: Latency L0s <256ns, L1 unlimited
		Link: ASPM Disabled RCB 128 bytes CommClk- ExtSynch-
		Link: Speed 2.5Gb/s, Width x4


