Date:	Mon, 16 Jun 2008 13:37:06 -0700
From:	"Waskiewicz Jr, Peter P" <peter.p.waskiewicz.jr@...el.com>
To:	"Denys Fedoryshchenko" <denys@...p.net.lb>,
	<netdev@...r.kernel.org>
Cc:	"Linux NICS" <linuxnics@...lbox.intel.com>
Subject: RE: packetloss, on e1000e worse than r8169?

>MegaRouter-KARAM /sys # ethtool -S eth1
>NIC statistics:
>     rx_packets: 109977509
>     tx_packets: 109887692
>     rx_bytes: 57656749138
>     tx_bytes: 57536071746
>     rx_broadcast: 6497
>     tx_broadcast: 92
>     rx_multicast: 48995
>     tx_multicast: 1960
>     rx_errors: 0
>     tx_errors: 0
>     tx_dropped: 0
>     multicast: 48995
>     collisions: 0
>     rx_length_errors: 0
>     rx_over_errors: 0
>     rx_crc_errors: 0
>     rx_frame_errors: 0
>     rx_no_buffer_count: 1796
>     rx_missed_errors: 2182679

This indicates that your host isn't processing Rx fast enough and your Rx
ring is running out of descriptors, so the hardware has to drop packets.
What's disturbing is that you actually do have flow control packets being
processed, so the NIC is trying to help the host.
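Those two counters show the overflow directly; a quick way to see whether drops are still accruing (a generic diagnostic, not something from this thread) is:

```shell
# Watch the ring-overflow counters on eth1 (the interface from the stats
# above). rx_no_buffer_count climbing means the Rx ring ran out of
# descriptors; rx_missed_errors means the MAC's FIFO overflowed and frames
# were discarded before they could even be placed on the ring.
watch -n1 'ethtool -S eth1 | grep -E "rx_no_buffer_count|rx_missed_errors"'
```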

>     tx_aborted_errors: 0
>     tx_carrier_errors: 0
>     tx_fifo_errors: 0
>     tx_heartbeat_errors: 0
>     tx_window_errors: 0
>     tx_abort_late_coll: 0
>     tx_deferred_ok: 55617
>     tx_single_coll_ok: 0
>     tx_multi_coll_ok: 0
>     tx_timeout_count: 0
>     tx_restart_queue: 1626
>     rx_long_length_errors: 0
>     rx_short_length_errors: 0
>     rx_align_errors: 0
>     tx_tcp_seg_good: 0
>     tx_tcp_seg_failed: 0
>     rx_flow_control_xon: 55461
>     rx_flow_control_xoff: 57329
>     tx_flow_control_xon: 39114
>     tx_flow_control_xoff: 48341
>     rx_long_byte_count: 57656749138
>     rx_csum_offload_good: 104097306
>     rx_csum_offload_errors: 2209

It's also a bit disturbing that Rx checksum offload is running into
issues, though I think this is a side effect of the rx_no_buffer_count.
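To rule out the checksum engine itself (my suggestion, not something raised in the thread), one could temporarily disable Rx checksum offload and watch whether the error counter keeps moving:

```shell
# Turn off Rx checksum offload on eth1 so the kernel verifies checksums
# in software; if packet loss persists with the offload engine out of the
# picture, the buffer exhaustion (rx_no_buffer_count) is the real culprit.
ethtool -K eth1 rx off
ethtool -S eth1 | grep rx_csum_offload_errors
```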

I see in a followup email you tried increasing your ring size to 4096
descriptors.  I'd suggest trying 512 descriptors first, a smaller step,
instead of going to 4096 out of the gate.  However, if your host can't
keep up with 256 descriptors, increasing the ring size will only delay
the problem, not fix it.  But I don't know what the profile of your
traffic is, so perhaps bumping the ring up to 512 or even 1024
descriptors might help.
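For reference, the ring is resized with ethtool's -G option (assuming eth1 as in the stats above; check the hardware maximum with -g first):

```shell
# Show the current and hardware-maximum ring sizes for eth1.
ethtool -g eth1
# Bump the Rx ring from the e1000e default of 256 to 512 descriptors.
ethtool -G eth1 rx 512
```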

Cheers,
-PJ Waskiewicz
<peter.p.waskiewicz.jr@...el.com>
--