Message-Id: <1181218576.4064.40.camel@localhost>
Date: Thu, 07 Jun 2007 08:16:16 -0400
From: jamal <hadi@...erus.ca>
To: Krishna Kumar2 <krkumar2@...ibm.com>
Cc: Gagan Arneja <gaagaan@...il.com>, Evgeniy Polyakov <johnpol@....mipt.ru>,
	netdev@...r.kernel.org, Rick Jones <rick.jones2@...com>,
	Sridhar Samudrala <sri@...ibm.com>, David Miller <davem@...emloft.net>,
	Robert Olsson <Robert.Olsson@...a.slu.se>
Subject: Re: [WIP][PATCHES] Network xmit batching

KK,

On Thu, 2007-07-06 at 14:12 +0530, Krishna Kumar2 wrote:
> I have run only once instead of taking any averages, so there could
> be some spurts/drops.

It would be nice to run three sets - but I think even one would be
sufficiently revealing.

> These results are based on the test script that I sent earlier today.
> I removed the results for the UDP 32-process 512 and 4096 buffer cases
> since the BW was coming out above line speed (in fact it was showing
> 1500 Mb/s and 4900 Mb/s respectively for both the ORG and these bits).

I expect UDP to overwhelm the receiver, so the receiver needs a lot
more tuning - like increased receive socket buffer sizes to keep up,
IMO (see the setsockopt() sketch after this message). But yes, the
above is an odd result - Rick, any insight into this?

> I am not sure how it is coming out this high, but netperf4 is the
> only way to correctly measure combined BW across multiple processes.
> Another thing to do is to disable pure performance fixes in E1000
> (e.g. changing THRESHOLD to 128 and some other changes like the
> erratum workaround or MSI, etc.) which are independent of this
> functionality. Then a more accurate performance result is possible
> when comparing org vs batch code, without mixing in unrelated
> performance fixes which skew the results (either positively or
> negatively :).

I agree that the THRESHOLD change needs to be the same for a fair
comparison. Note, however, that it is definitely a tuning parameter
which is a fundamental aspect of this batching exercise (historically
it was added to e1000 because I found it useful in my 2006 batch
experiments). When all the dust settles we should be able to pick a
value that is optimal. Would it be useful if I made this a boot/module
parameter? It should have been one already, actually (see the
module_param() sketch after this message).

The erratum changes I am not so sure about; ->prep_xmit() is a
fundamental aspect and it needs to run lockless, whereas the erratum
forces us to run with a lock. In any case, I don't think that erratum
affects your chip.

> Each iteration consists of running buffer sizes 8, 32, 128, 512, 4096.

It seems to me that any runs with buffers smaller than 512B are unable
to fill the pipe - so they will not really benefit (Nagling will
probably take care of them). However, the < 512B cases should show
equivalent results before and after the changes. You can try turning
off the _BTX feature in the driver and see if they are the same. If
they are not, then the suspect change will be easy to find. When I
turned off the _BTX changes I saw no difference with pktgen - but that
is a different code path.

> Summary : Average BW (whatever meaning that has) improved 0.65%, while
> Service Demand deteriorated 11.86%

Sorry, it has been many moons since I last played with netperf; what
does "service demand" mean?

cheers,
jamal
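
On the receiver-tuning point above: a minimal sketch, in plain C, of
enlarging a UDP receive socket buffer with setsockopt(). The 4 MB
request size and the helper name bump_rcvbuf() are illustrative
assumptions, not anything from the thread; on Linux the grant is also
capped by net.core.rmem_max, so that sysctl may need raising as well.

    #include <stdio.h>
    #include <sys/socket.h>

    /* Request a larger receive buffer and report what the kernel
     * actually granted. */
    static int bump_rcvbuf(int sock)
    {
            int size = 4 * 1024 * 1024;     /* example request: 4 MB */
            socklen_t optlen = sizeof(size);

            if (setsockopt(sock, SOL_SOCKET, SO_RCVBUF,
                           &size, sizeof(size)) < 0)
                    return -1;

            /* Linux doubles the request (to account for bookkeeping
             * overhead) and caps it at net.core.rmem_max; read back
             * the effective value rather than trusting the request. */
            if (getsockopt(sock, SOL_SOCKET, SO_RCVBUF,
                           &size, &optlen) == 0)
                    printf("effective SO_RCVBUF: %d bytes\n", size);

            return 0;
    }

Applying something like this on the netperf receiver's data socket (or
raising rmem_default globally) would help show whether the odd UDP
numbers are just receive-side drops.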
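
And on the boot/module-parameter point: a minimal sketch of how such a
threshold could be exposed with module_param(). The parameter name
tx_batch_threshold and the default of 128 are assumptions for
illustration - this is not the actual e1000 patch.

    #include <linux/module.h>
    #include <linux/moduleparam.h>

    /* Hypothetical knob: how many tx descriptors to accumulate before
     * kicking the NIC. The default of 128 echoes the THRESHOLD value
     * mentioned in the thread. */
    static int tx_batch_threshold = 128;
    module_param(tx_batch_threshold, int, 0644);
    MODULE_PARM_DESC(tx_batch_threshold,
                     "Tx descriptors to batch before kicking the hardware");

With 0644 permissions it would also show up writable under
/sys/module/<driver>/parameters/, so different values could be tried at
load time (e.g. modprobe e1000 tx_batch_threshold=64) or at runtime
without rebuilding.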