Message-ID: <1298463704.31256.29.camel@krikkit>
Date:	Wed, 23 Feb 2011 13:21:44 +0100
From:	Hans Nieser <hnsr@...all.nl>
To:	Francois Romieu <romieu@...zoreil.com>
Cc:	netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: Mass udp flow reboot linux with RealTek RTL-8169 Gigabit
On Wed, 2011-02-23 at 10:55 +0100, Francois Romieu wrote:
> Hans Nieser <hnsr@...all.nl> :
> [...]
> > With your patches applied to 2.6.38-rc6, I have gathered some of the
> > info you requested from Seblu as well, I hope it's helpful:
> > 
> > 1: see attachment
> 
> Ok.
> 
> The chipset requires no trivial last minute regression fix (yet).
> 
> > 2: I'm not sure how to check the size of the packets, but I'm just
> > fetching a (large) file over http/tcp, so I guess they are mostly of the
> > size of my MTU which is 1500 looking at ifconfig output
> 
> Fine.
> 
> Your testcases are always based on a real download, whence including some
> disk activity, as opposed to a pure network test, right ?

Yeah, I just had a little script that, in a loop, fetched a file with
wget from a webserver on my LAN, saved it to a separate (non-root)
filesystem, and then removed it. On the 2.6.35 and 2.6.35.9 kernels it
maxed out at about 107 MiB/s, occasionally dipping a little, presumably
when the disk was being accessed.

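For reference, a minimal sketch of that kind of loop. This is not the
actual script: the URL, the destination path, the FETCH variable and the
download_loop name are all placeholders picked for illustration.

```shell
#!/bin/sh
# Sketch of a fetch/save/remove loop. All values below are hypothetical
# placeholders, not the ones used in the original test.
URL="${URL:-http://webserver.example/largefile.bin}"
DEST="${DEST:-/mnt/scratch/largefile.bin}"   # assumed to be on a non-root fs
FETCH="${FETCH:-wget -q -O}"                 # command invoked as: $FETCH <dest> <url>

download_loop() {
    # $1 = number of iterations; 0 means loop until interrupted
    count=0
    while [ "$1" -eq 0 ] || [ "$count" -lt "$1" ]; do
        # download to DEST, stop on the first failed transfer
        $FETCH "$DEST" "$URL" || return 1
        # remove the file again so every pass writes to disk from scratch
        rm -f "$DEST"
        count=$((count + 1))
    done
}
```

Calling `download_loop 0` runs until interrupted (or until the machine
locks up); a positive argument limits the number of passes.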
> > For the other vmstat/ethtool/interrupts output, I started the following
> > commands remotely via ssh a second or two before starting the download,
> > and the machine locked up a few seconds later:
> 
> SysRq is enabled (/etc/sysctl.conf::kernel.sysrq = 1), the computer was
> switched back on a no-X console before the test. Then the keyboard leds
> ignore keypresses and the sysrq keys don't display anything in the
> console, right ?

Yep, I had shut down X and switched to VT1. After the lockup the LEDs
can no longer be toggled and the SysRq key combo is unresponsive (it
works if I use it before the machine locks up).

> You may enable PCIEASPM_DEBUG, force 'pcie_aspm=off' and switch from
> SLUB to SLAB but it's a bit cargo-cultish.

I'll give that a try this evening.

> A bisection could help. Bisecting 2.6.35 .. 2.6.35.9 may be enough if
> 2.6.35.9 works well.

Hmm, did you mean bisecting 2.6.35.9 .. 2.6.36 ? With 2.6.36 and above
I can get the machine to hang within seconds and performance is really
bad (10-20 MiB/s with wget), while with 2.6.35 and 2.6.35.9 performance
was really good (reaching 107 MiB/s most of the time) and the lockup
took 5-10 minutes instead of seconds. I guess I didn't mention this in
my last e-mail, but I managed to get both 2.6.35 and 2.6.35.9 to lock
up eventually. So I guess something changed between .35 and .36 that
made the issue easier to trigger.

I can also try even older kernels to see if there is one that doesn't
lock up at all.