Date: Mon, 14 Jan 2008 17:56:28 +0100
From: Eric Dumazet <dada1@...mosbay.com>
To: Chris Friesen <cfriesen@...tel.com>
Cc: Ray Lee <ray-lk@...rabbit.org>, netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: questions on NAPI processing latency and dropped network packets

Chris Friesen wrote:
> Ray Lee wrote:
>> On Jan 10, 2008 9:24 AM, Chris Friesen <cfriesen@...tel.com> wrote:
>
>>> After a recent userspace app change, we've started seeing packets being
>>> dropped by the ethernet hardware (e1000, NAPI is enabled). The
>>> error/dropped/fifo counts are going up in ethtool:
>
>> Can you reproduce it with a simple userspace cpu hog? (Two, really,
>> one per cpu.)
>> Can you reproduce it with the newer e1000?
>
> Hmm...good questions, and I haven't checked either. The first one is
> relatively straightforward. The second is a bit trickier...last time
> I tried the latest e1000 driver the card wouldn't boot (we use netboot).
>
>> Can you reproduce it with git head?
>
> Unfortunately, I don't think I'll be able to try this. We require
> kernel mods for our userspace to run, and I doubt I'd be able to get
> the time to port all the changes forward to git head.
>
>> If the answer to the first one is yes, the last no, then bisect until
>> you get a kernel that doesn't show the problem. Backport the fix,
>> unless the fix happens to be CFS. However, I suspect that your
>> userspace app is just starving the system from time to time.
>
> It's conceivable that userspace is starving the kernel, but we do have
> about 45% idle on one cpu, and 7-10% idle on the other.
>
> We also have an odd situation where on an initial test run after
> bootup we have 18-24% idle on cpu1, but resetting the test tool drops
> that to the 7-10% I mentioned above.
>
> Based on profiling and instrumentation it seems like the cost of
> sctp_endpoint_lookup_assoc() more than triples, which means that the
> amount of time that bottom halves are disabled in that function also
> triples.

Any idea what sctp hash table size you have? (Your dmesg probably
includes a message starting with "SCTP: Hash tables configured...")

How many concurrent sctp sockets are handled?

Maybe sctp_assoc_hashfn() is too weak for your use, and some chains are
*really* long.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
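[Editor's note: to make the "weak hash, long chains" concern concrete, here is a
minimal userspace sketch for estimating chain lengths. The hash below only
mimics the general shape of the 2.6-era sctp_assoc_hashfn() (local and remote
port combined and folded); it is not the kernel source, and HASH_SIZE 4096 and
the port pattern are placeholder assumptions you would replace with the size
from your "SCTP: Hash tables configured..." boot message and your real traffic
pattern.]

    /* Sketch: count how SCTP-style associations would spread across a
     * hash table of HASH_SIZE buckets, and report the longest chain.
     * The hash function is an illustrative stand-in, not the kernel's. */
    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>

    #define HASH_SIZE 4096  /* assumed; use the size reported in dmesg */

    static unsigned int assoc_hash(uint16_t lport, uint16_t rport)
    {
            unsigned int h = ((unsigned int)lport << 16) + rport;

            h ^= h >> 8;
            return h & (HASH_SIZE - 1);
    }

    int main(void)
    {
            unsigned int buckets[HASH_SIZE];
            unsigned int i, max = 0;

            memset(buckets, 0, sizeof(buckets));

            /* Example workload: one well-known local port, many remote
             * ports. Substitute the port pairs your application uses. */
            for (i = 0; i < 20000; i++)
                    buckets[assoc_hash(2905, (uint16_t)(10000 + i))]++;

            for (i = 0; i < HASH_SIZE; i++)
                    if (buckets[i] > max)
                            max = buckets[i];

            printf("longest chain: %u of 20000 associations\n", max);
            return 0;
    }

If the longest chain for your real port pattern is far above the average
(associations / buckets), lookups under that bucket get proportionally more
expensive, which would match the observed growth in time spent in
sctp_endpoint_lookup_assoc() with bottom halves disabled.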