netdev - Re: [Bugme-new] [Bug 14737] New: e1000e driver experiences large packet losses

Message-ID: <81bfc67a0912071420w3efd9f06x24e18a9179f29ea7@mail.gmail.com>
Date:	Mon, 7 Dec 2009 17:20:43 -0500
From:	Caleb Cushing <xenoterracide@...il.com>
To:	"Brandeburg, Jesse" <jesse.brandeburg@...el.com>
Cc:	"Kirsher, Jeffrey T" <jeffrey.t.kirsher@...el.com>,
	"Allan, Bruce W" <bruce.w.allan@...el.com>,
	"Waskiewicz Jr, Peter P" <peter.p.waskiewicz.jr@...el.com>,
	"Ronciak, John" <john.ronciak@...el.com>,
	"bugzilla-daemon@...zilla.kernel.org" 
	<bugzilla-daemon@...zilla.kernel.org>,
	"bugme-daemon@...zilla.kernel.org" <bugme-daemon@...zilla.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [Bugme-new] [Bug 14737] New: e1000e driver experiences large 
	packet losses

On Mon, Dec 7, 2009 at 4:53 PM, Brandeburg, Jesse
<jesse.brandeburg@...el.com> wrote:
> thanks akpm, I've been watching this thread but now I will try to jump in.
>
> Caleb, can you please summarize where we are today, you've done a lot of
> testing and the thread has gone on a while.
>
> Kernels known to fail (after any length):
2.6.29.6 through 2.6.32 is the range I've tested. 2.6.29 only seemed to
have 10% packet loss with mtr, as opposed to the 30-50% on the later
kernels. Still, that's abnormal and shouldn't be happening. I haven't
tested farther back yet.

> Kernels known to work:

Flawlessly, none at this point. I've been able to replicate it on every
version tested. Since it doesn't happen on every reboot and I rarely
reboot, it's difficult to test. Other than the fact that I've been
unable to find a good kernel, nothing suggests hardware failure. Then
again, some of the other e1000e bugs go back farther than I've
tested...

> Have you been able to try the latest e1000e from 2.6.32?  it has some
> fixes in it, although none right off the top of my head that will fix your
> issue.

Yes, and it's still reproducible. Whether it occurs as often, I'm not sure.

> I have a couple of related questions, why don't you have irqbalance
> enabled?  Network interrupts should not be migrating across all cpus
> evenly, at the very least your system should be reconfigured to lock the
> interrupts to a particular core with smp_affinity.

Is that new with 2.6.32? If not, I don't know... I'm using Arch Linux's
config as a base; if it's something they should have enabled, I can
relay the message.
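
For reference, the smp_affinity setting mentioned in the quoted question is a hexadecimal CPU bitmask written to /proc/irq/<irq>/smp_affinity. Below is a minimal sketch of building that mask. The helper name is hypothetical, the IRQ number is system-specific and deliberately left out, and systems with more than 32 CPUs (where the kernel prints comma-separated 32-bit groups) are not handled.

```python
def affinity_mask(cpus):
    """Return the hex bitmask string for a set of CPU indices (bit n = CPU n)."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU index must be non-negative")
        mask |= 1 << cpu
    return format(mask, "x")

# Pin to CPU 2 only. Applying the mask needs root: write the string to
# /proc/irq/<irq>/smp_affinity, where <irq> is the NIC's interrupt number
# as listed in /proc/interrupts.
print(affinity_mask([2]))  # prints 4
```

If irqbalance is running it may rewrite the mask later, so a manual setting like this is mainly useful while irqbalance is disabled.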

> There is nothing in the ethtool -S statistics that I see that indicates
> anything is wrong, you've gotten no tx timeouts as far as I can tell, have
> you had any system panics (possibly seeming unrelated to network?)

No. My system seems perfectly stable (outside of some end-user
software bugs, and even then only Kopete seems to crash these days,
because I'm using an experimental protocol). I can't explain why the
tests don't show anything wrong...
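
One way to double-check the `ethtool -S` output is to filter it for non-zero counters whose names look error-related. This is only a rough sketch: the substring list is an assumption rather than an official classification, and the sample text is made up to match the general shape of `ethtool -S` output, not taken from this bug.

```python
import re

# Substrings that usually mark error-type counters (an assumed heuristic).
SUSPECT = ("err", "drop", "miss", "timeout", "no_buffer")

def suspicious_counters(ethtool_output):
    """Return {name: value} for non-zero counters whose name looks error-related."""
    found = {}
    for line in ethtool_output.splitlines():
        m = re.match(r"\s*([\w.\-]+):\s*(\d+)\s*$", line)
        if not m:
            continue  # skips the "NIC statistics:" header and blank lines
        name, value = m.group(1), int(m.group(2))
        if value and any(s in name for s in SUSPECT):
            found[name] = value
    return found

# Made-up sample data for illustration only.
sample = """NIC statistics:
     rx_packets: 1000
     tx_packets: 900
     rx_missed_errors: 0
     rx_no_buffer_count: 7
     tx_timeout_count: 0
"""
print(suspicious_counters(sample))  # prints {'rx_no_buffer_count': 7}
```

An empty result would agree with the observation that the statistics show nothing wrong, which points away from the NIC counters and toward something else in the path.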

Hmm, a thought: possibly iptables is dropping them as INVALID? I
still think that testing on just this system, with one NIC hooked
into the other, might be a good idea, as the firewall configuration in
OpenWrt is not straightforward to me. This would also remove any QoS
rules that the router is applying, and the random packets floating
around (that Windows boxen are sending).
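
The INVALID idea could be checked on whichever box does the filtering by watching conntrack's `invalid` counter. The sketch below assumes /proc/net/stat/nf_conntrack is available and laid out as a header row followed by one row of hexadecimal counters per CPU. The function name is hypothetical, and the sample is made up and truncated to the first few columns.

```python
def conntrack_invalid_total(stat_text):
    """Sum the 'invalid' column across all per-CPU rows (values are hexadecimal)."""
    lines = [ln.split() for ln in stat_text.splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    col = header.index("invalid")
    return sum(int(row[col], 16) for row in rows)

# Made-up two-CPU sample for illustration only.
sample = """entries  searched found new invalid ignore
0000001a  00000000 00000000 00000010 0000000a 00000003
0000001a  00000000 00000000 00000004 00000002 00000001
"""
print(conntrack_invalid_total(sample))  # prints 12
```

Comparing the total before and after an mtr run would show whether it grows with the loss. A rising count only says that conntrack classified packets as invalid, not that a rule actually dropped them.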
-- 
Caleb Cushing

http://xenoterracide.blogspot.com