Date:	Sun, 23 Sep 2007 13:53:07 -0400
From:	jamal <hadi@...erus.ca>
To:	David Miller <davem@...emloft.net>
Cc:	krkumar2@...ibm.com, johnpol@....mipt.ru,
	herbert@...dor.apana.org.au, kaber@...sh.net,
	shemminger@...ux-foundation.org, jagana@...ibm.com,
	Robert.Olsson@...a.slu.se, rick.jones2@...com, xma@...ibm.com,
	gaagaan@...il.com, netdev@...r.kernel.org, rdreier@...co.com,
	peter.p.waskiewicz.jr@...el.com, mcarlson@...adcom.com,
	jeff@...zik.org, mchan@...adcom.com, general@...ts.openfabrics.org,
	kumarkr@...ux.ibm.com, tgraf@...g.ch, randy.dunlap@...cle.com,
	sri@...ibm.com
Subject: [PATCHES] TX batching


I had plenty of time this weekend, so I have been doing a _lot_ of
testing. My next emails will carry a set of patches:
 
Patch 1: Introduces explicit tx locking
Patch 2: Introduces batching interface
Patch 3: Core uses batching interface
Patch 4: get rid of dev->gso_skb

Testing
-------
Each of these patches has been performance tested and the results
are in the logs on a per-patch basis. 
My system under test is a 2x dual-core Opteron with a couple of
tg3s.
My test tool generates UDP traffic of different sizes for up to 60
seconds per run, or a total of 30M packets. I have 4 threads, each
running on a specific CPU, which keep all the CPUs as busy as they can
sending packets targeted at a directly connected box's UDP discard
port.

All 4 CPUs target a single tg3 to send. The receiving box has a tc rule
which counts and drops all incoming UDP packets to the discard port - this
allows me to make sure that the receiver is not the bottleneck in the
testing. Packet sizes sent are {64B, 128B, 256B, 512B, 1024B}. Each
packet size run is repeated 10 times to ensure that there are no
transients. The average of all 10 runs is then computed and collected.

I have not run testing on patch #4 because I had to let the machine
go, but I will have some access to it tomorrow early morning, when I can
run some tests.

Comments
--------
I am trying to kill ->hard_batch_xmit(), but it would be tricky to do
without it for LLTX drivers. Anything I try will require a few extra
checks. OTOH, I could kill LLTX for the drivers I am using that
are LLTX and then drop that interface, or I could say "no support
for LLTX". I am in a dilemma.

Dave, please let me know if this meets your desire to allow devices
which are SG-capable and able to compute the CSUM to benefit, just in
case I misunderstood.
Herbert, if you can look at at least patch 4, I will appreciate it.

More patches to follow - I didn't want to overload people by dumping
too many patches at once. Most of the patches below are ready to go; some
need more testing and others need a little porting from an earlier
kernel:
- tg3 driver (tested and works well, but I don't want to send it yet)
- tun driver
- pktgen
- netiron driver
- e1000 driver
- ethtool interface
- There is at least one other driver promised to me

I am also going to update the two documents I posted earlier.
Hopefully I can do that today.

cheers,
jamal

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
