Date: Wed, 23 Feb 2011 19:31:33 +0100
From: Hans Nieser <hnsr@...all.nl>
To: Francois Romieu <romieu@...zoreil.com>
Cc: netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: Mass udp flow reboot linux with RealTek RTL-8169 Gigabit

On Wed, 2011-02-23 at 13:21 +0100, Hans Nieser wrote:
> On Wed, 2011-02-23 at 10:55 +0100, Francois Romieu wrote:
> > Hans Nieser <hnsr@...all.nl> :
> > [...]
> > > With your patches applied to 2.6.38-rc6, I have gathered some of the
> > > info you requested from Seblu as well, I hope it's helpful:
> > >
> > > 1: see attachment
> >
> > Ok.
> >
> > The chipset requires no trivial last minute regression fix (yet).
> >
> > > 2: I'm not sure how to check the size of the packets, but I'm just
> > > fetching a (large) file over http/tcp, so I guess they are mostly of the
> > > size of my MTU which is 1500 looking at ifconfig output
> >
> > Fine.
> >
> > Your testcases are always based on a real download, whence including some
> > disk activity, as opposed to a pure network test, right ?
>
> Yeah, I just had a little script that wgetted a file from a webserver in
> my LAN and saved it to separate (non-root) fs, then removed it - in a
> loop. When testing on the 2.6.35 and 2.6.35.9 kernels it did max out at
> about 107MiB/s, sometimes falling down a little presumably when disk was
> being touched.
>
> > > For the other vmstat/ethtool/interrupts output, I started the following
> > > commands remotely via ssh a second or two before starting the download,
> > > and the machine locked up a few seconds later:
> >
> > SysRq is enabled (/etc/sysctl.conf::kernel.sysrq = 1), the computer was
> > switched back on a no-X console before the test. Then the keyboard leds
> > ignore keypresses and the sysrq keys don't display anything in the
> > console, right ?
>
> Yep I had X shutdown and switched to VT1, after lock up the LEDs can't
> be toggled anymore and sysrq key combo was nonresponsive (it works if I
> do it before it locks up)
>
> > You may enable PCIEASPM_DEBUG, force 'pcie_aspm=off' and switch from
> > SLUB to SLAB but it's a bit cargo-cultish.
>
> I'll give that a try this evening
>
> > A bisection could help. Bisecting 2.6.35 .. 2.6.35.9 may be enough if
> > 2.6.35.9 works well.
>
> Hmm did you mean bisecting 2.6.36 - 2.6.35.9 ? Since with 2.6.36 and
> above I can get the machine to hang within seconds and performance is
> really bad (10-20MiB/s with wget), while with 2.6.35.9 and 2.6.35
> performance was really good (reaching 107MiB/s most of the time) and
> lock up took 5-10 minutes instead of seconds (I guess I didn't mention
> this in my last e-mail but I managed to get both 2.6.35 and 2.6.35.9 to
> lock up eventually) - but I guess something changed between .35 and .36
> that made the issue easier to trigger.
>
> I can also try even older kernels to see if there is one that doesn't
> lock up at all
>

Ok, I just tried 2.6.34, and after over 5 hours of running my script,
the system is still up and running, with only 24 'link up' messages on
dmesg, and having transferred 2.1TiB of data (1428042421 rx_packets, 45
rx_missed). So I'm going to assume the problem isn't present with this
kernel and try a bisect between it and 2.6.35

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
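For anyone trying to reproduce the load described in this message, here is a minimal sketch of a fetch-and-delete loop. It is not the script used in the thread (that one called wget and was not posted); the function names, the temporary file name and the throughput printout are assumptions made for illustration, and Python's standard urllib stands in for wget.

```python
import os
import shutil
import time
import urllib.request


def fetch_http(url, dest_path):
    """Download url to dest_path over HTTP and return the number of bytes saved."""
    with urllib.request.urlopen(url) as resp, open(dest_path, "wb") as out:
        shutil.copyfileobj(resp, out)
    return os.path.getsize(dest_path)


def download_loop(url, dest_dir, iterations, fetch=fetch_http):
    """Fetch url into dest_dir, delete the file, and repeat.

    Returns the total number of bytes transferred. The fetch argument is a
    hypothetical hook so the loop can be exercised without a network.
    """
    total = 0
    # Assumed file name; any scratch path on the target filesystem would do.
    dest_path = os.path.join(dest_dir, "stress-download.tmp")
    for i in range(iterations):
        start = time.monotonic()
        size = fetch(url, dest_path)
        elapsed = max(time.monotonic() - start, 1e-9)  # avoid division by zero
        os.remove(dest_path)  # free the space before the next pass
        total += size
        print(f"pass {i + 1}: {size / elapsed / 2**20:.1f} MiB/s")
    return total
```

Pointing it at a large file on a LAN web server and at a directory on a separate (non-root) filesystem, as the message describes, would be something like download_loop(url, "/mnt/scratch", 1000), where both the path and the count are placeholders.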