netdev - Re: [PATCH 0/9 Rev3] Implement batching skb API and support in IPoIB

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Fri, 24 Aug 2007 20:42:59 -0400
From:	John Heffner <jheffner@....edu>
To:	Bill Fink <billfink@...dspring.com>
CC:	Rick Jones <rick.jones2@...com>, hadi@...erus.ca,
	David Miller <davem@...emloft.net>, krkumar2@...ibm.com,
	gaagaan@...il.com, general@...ts.openfabrics.org,
	herbert@...dor.apana.org.au, jagana@...ibm.com, jeff@...zik.org,
	johnpol@....mipt.ru, kaber@...sh.net, mcarlson@...adcom.com,
	mchan@...adcom.com, netdev@...r.kernel.org,
	peter.p.waskiewicz.jr@...el.com, rdreier@...co.com,
	Robert.Olsson@...a.slu.se, shemminger@...ux-foundation.org,
	sri@...ibm.com, tgraf@...g.ch, xma@...ibm.com
Subject: Re: [PATCH 0/9 Rev3] Implement batching skb API and support in IPoIB

Bill Fink wrote:
> Here you can see there is a major difference in the TX CPU utilization
> (99 % with TSO disabled versus only 39 % with TSO enabled), although
> the TSO disabled case was able to squeeze out a little extra performance
> from its extra CPU utilization.  Interestingly, with TSO enabled, the
> receiver actually consumed more CPU than with TSO disabled, so I guess
> the receiver CPU saturation in that case (99 %) was what restricted
> its performance somewhat (this was consistent across a few test runs).

One possibility is that I think the receive-side processing tends to do 
better when receiving into an empty queue.  When the (non-TSO) sender is 
the flow's bottleneck, this is going to be the case.  But when you 
switch to TSO, the receiver becomes the bottleneck and you're always 
going to have to put the packets at the back of the receive queue.  This 
might help account for the reason why you have both lower throughput and 
higher CPU utilization -- there's a point of instability right where the 
receiver becomes the bottleneck and you end up pushing it over to the 
bad side. :)

Just a theory.  I'm honestly surprised this effect would be so 
significant.  What do the numbers from netstat -s look like in the two 
cases?

   -John
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html