Message-ID: <4B50B032.2060609@hp.com>
Date: Fri, 15 Jan 2010 10:13:06 -0800
From: Rick Jones <rick.jones2@...com>
To: Krishna Kumar <krkumar2@...ibm.com>
CC: davem@...emloft.net, ilpo.jarvinen@...sinki.fi, netdev@...r.kernel.org,
	eric.dumazet@...il.com
Subject: Re: [RFC] [PATCH] Optimize TCP sendmsg in favour of fast devices?

Krishna Kumar wrote:
> From: Krishna Kumar <krkumar2@...ibm.com>
>
> Remove inline skb data in tcp_sendmsg(). For the few devices that
> don't support NETIF_F_SG, dev_queue_xmit will call skb_linearize,
> and pass the penalty to those slow devices (the following drivers
> do not support NETIF_F_SG: 8139cp.c, amd8111e.c, dl2k.c, dm9000.c,
> dnet.c, ethoc.c, ibmveth.c, ioc3-eth.c, macb.c, ps3_gelic_net.c,
> r8169.c, rionet.c, spider_net.c, tsi108_eth.c, veth.c,
> via-velocity.c, atlx/atl2.c, bonding/bond_main.c, can/dev.c,
> cris/eth_v10.c).
>
> This patch does not affect devices that support SG but turn it off
> via ethtool after register_netdev.
>
> I ran the following test cases with iperf - #threads: 1 4 8 16 32
> 64 128 192 256, I/O sizes: 256 4K 16K 64K, each test case runs for
> 1 minute, repeated for 5 iterations. Total test run time is 6 hours.
> System is a 4-proc Opteron with a Chelsio 10gbps NIC. Results (BW
> figures are the aggregate across 5 iterations, in mbps):
>
> -------------------------------------------------------
> #Process   I/O Size   Org-BW    New-BW    %-change
> -------------------------------------------------------
> 1          256        2098      2147      2.33
> 1          4K         14057     14269     1.50
> 1          16K        25984     27317     5.13
> 1          64K        25920     27539     6.24
> ...
> 256        256        1947      1955      0.41
> 256        4K         9828      12265     24.79
> 256        16K        25087     24977     -0.43
> 256        64K        26715     27997     4.79
> -------------------------------------------------------
> Total:     -          600071    634906    5.80
> -------------------------------------------------------

Does bandwidth alone convey the magnitude of the change?  I would think
that would only be the case if the CPU(s) were 100% utilized, and perhaps
not even completely then.  At the risk of a shameless plug, it's not for
nothing that netperf reports service demand :)  I would think that the
change in service demand (CPU per unit of work) would be something one
wants to see.

Also, the world does not run on bandwidth alone, so small-packet
performance and any delta there would be good to have.

Multiple-process tests may not be as easy in netperf as they are in
iperf, but under:

ftp://ftp.netperf.org/netperf/misc

I have a single-stream test script I use called runemomni.sh and an
example of its output, as well as an aggregate script I use called
runemomniagg2.sh - I'll post an example of its output there as soon as I
finish some runs.  The script presumes one has ./configure'd netperf:

./configure --enable-burst --enable-omni ...

The netperf omni tests still ass-u-me that the CPU util each measures is
all its own, which means the service demands from aggregate tests require
some post-processing fixup:

http://www.netperf.org/svn/netperf2/trunk/doc/netperf.html#Using-Netperf-to-Measure-Aggregate-Performance

happy benchmarking,

rick jones

FWIW, service demand and pps performance may be even more important for
non-SG devices because they may be slow 1 Gig devices that still hit
link-rate on a bulk throughput test even with a non-trivial increase in
CPU util.  However, a non-trivial hit in CPU util may very well change
the pps performance.

PPS - there is a *lot* of output in those omni test results - best viewed
with a spreadsheet program.
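
To make the "CPU per unit of work" idea concrete, here is a
back-of-the-envelope sketch of how a service-demand figure can be derived
from CPU utilization, CPU count, run time and bytes moved.  All the input
numbers are assumed for illustration, and this is not netperf's exact
accounting, which may be normalized differently.

/*
 * Back-of-the-envelope illustration of "service demand" as CPU cost per
 * unit of work.  Assumed example numbers; not netperf's exact formula.
 */
#include <stdio.h>

int main(void)
{
	double util  = 0.35;	/* assumed CPU utilization, 0..1 */
	int    ncpus = 4;	/* e.g. a 4-proc box as in the quoted test */
	double secs  = 60.0;	/* one-minute run */
	double mbps  = 10000.0;	/* assumed per-run throughput, Mbit/s */

	double cpu_sec = util * ncpus * secs;			/* CPU-seconds burned */
	double kbytes  = mbps * 1e6 / 8.0 * secs / 1024.0;	/* KB moved */

	printf("service demand ~= %.3f usec of CPU per KB\n",
	       cpu_sec * 1e6 / kbytes);
	return 0;
}

Two runs with the same bandwidth but different utilization then show up
as different service demands, which is the regression (or improvement) a
bandwidth-only table can hide.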
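For readers less familiar with the NETIF_F_SG distinction the quoted
patch leans on, the following userspace sketch models the idea: a
fragmented send buffer goes to SG-capable hardware as-is, while a non-SG
device pays a one-time flattening copy.  It is purely illustrative; the
struct and function names are hypothetical stand-ins, not the actual skb
API.

/*
 * Illustrative userspace sketch only -- not the kernel code.  It models
 * the trade-off in the patch: keep the payload in scattered fragments on
 * the send path, and pay a one-time copy ("linearize") only when the
 * outgoing device cannot do scatter/gather.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NFRAGS  4
#define FRAG_SZ 1024

struct fake_skb {
	char   *frags[NFRAGS];		/* scattered payload fragments */
	size_t  frag_len[NFRAGS];
	char   *linear;			/* contiguous copy, built on demand */
	size_t  len;
};

/* Stand-in for skb_linearize(): copy all fragments into one buffer. */
static int fake_linearize(struct fake_skb *skb)
{
	size_t off = 0;
	int i;

	skb->linear = malloc(skb->len);
	if (!skb->linear)
		return -1;
	for (i = 0; i < NFRAGS; i++) {
		memcpy(skb->linear + off, skb->frags[i], skb->frag_len[i]);
		off += skb->frag_len[i];
	}
	return 0;
}

static void fake_xmit(struct fake_skb *skb, int dev_has_sg)
{
	if (dev_has_sg)
		printf("SG device: %d fragments handed straight to the driver\n",
		       NFRAGS);
	else if (fake_linearize(skb) == 0)
		printf("non-SG device: copied %zu bytes into one buffer first\n",
		       skb->len);
}

int main(void)
{
	struct fake_skb skb = { .len = NFRAGS * FRAG_SZ };
	int i;

	for (i = 0; i < NFRAGS; i++) {
		skb.frags[i] = calloc(1, FRAG_SZ);
		skb.frag_len[i] = FRAG_SZ;
	}
	fake_xmit(&skb, 1);	/* fast NIC: no extra copy */
	fake_xmit(&skb, 0);	/* e.g. 8139cp, dm9000, veth: pay the copy */
	free(skb.linear);
	return 0;
}

The copy is exactly the per-packet CPU cost the PS above is worried
about: on a slow non-SG NIC it may not dent the bulk-throughput number,
but it should be visible in service demand and pps.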