lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <46CF7B13.3020701@psc.edu>
Date:	Fri, 24 Aug 2007 20:42:59 -0400
From:	John Heffner <jheffner@....edu>
To:	Bill Fink <billfink@...dspring.com>
CC:	Rick Jones <rick.jones2@...com>, hadi@...erus.ca,
	David Miller <davem@...emloft.net>, krkumar2@...ibm.com,
	gaagaan@...il.com, general@...ts.openfabrics.org,
	herbert@...dor.apana.org.au, jagana@...ibm.com, jeff@...zik.org,
	johnpol@....mipt.ru, kaber@...sh.net, mcarlson@...adcom.com,
	mchan@...adcom.com, netdev@...r.kernel.org,
	peter.p.waskiewicz.jr@...el.com, rdreier@...co.com,
	Robert.Olsson@...a.slu.se, shemminger@...ux-foundation.org,
	sri@...ibm.com, tgraf@...g.ch, xma@...ibm.com
Subject: Re: [PATCH 0/9 Rev3] Implement batching skb API and support in IPoIB

Bill Fink wrote:
> Here you can see there is a major difference in the TX CPU utilization
> (99 % with TSO disabled versus only 39 % with TSO enabled), although
> the TSO disabled case was able to squeeze out a little extra performance
> from its extra CPU utilization.  Interestingly, with TSO enabled, the
> receiver actually consumed more CPU than with TSO disabled, so I guess
> the receiver CPU saturation in that case (99 %) was what restricted
> its performance somewhat (this was consistent across a few test runs).


One possibility is that I think the receive-side processing tends to do 
better when receiving into an empty queue.  When the (non-TSO) sender is 
the flow's bottleneck, this is going to be the case.  But when you 
switch to TSO, the receiver becomes the bottleneck and you're always 
going to have to put the packets at the back of the receive queue.  This 
might help account for the reason why you have both lower throughput and 
higher CPU utilization -- there's a point of instability right where the 
receiver becomes the bottleneck and you end up pushing it over to the 
bad side. :)

Just a theory.  I'm honestly surprised this effect would be so 
significant.  What do the numbers from netstat -s look like in the two 
cases?

   -John
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ