Date:	Wed, 08 Aug 2012 18:44:27 +0200
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Paul Gortmaker <paul.gortmaker@...driver.com>
Cc:	Claudiu Manoil <claudiu.manoil@...escale.com>,
	netdev@...r.kernel.org, "David S. Miller" <davem@...emloft.net>
Subject: Re: [RFC net-next 0/4] gianfar: Use separate NAPI for Tx
 confirmation processing

On Wed, 2012-08-08 at 12:24 -0400, Paul Gortmaker wrote:
> [[RFC net-next 0/4] gianfar: Use separate NAPI for Tx confirmation processing] On 08/08/2012 (Wed 15:26) Claudiu Manoil wrote:
> 
> > Hi all,
> > This set of patches basically splits the existing napi poll routine into
> > two separate napi functions, one for Rx processing (triggered by frame
> > receive interrupts only) and one for the Tx confirmation path processing
> > (triggered by Tx confirmation interrupts only). The underlying polling
> > algorithm remains much the same.
> > 
> > Significant throughput improvements have been observed on low-power
> > boards with this set of changes.
> > For instance, the following netperf test:
> > netperf -l 20 -cC -H 192.168.10.1 -t TCP_STREAM -- -m 1500
> > yields a throughput gain from an oscillating ~500-700 Mbps to a steady
> > ~940 Mbps (if the Rx/Tx paths are processed on different cores), with
> > no increase in CPU utilization, on a p1020rdb, a 2-core machine
> > featuring etsec2.0 (Multi-Queue Multi-Group driver mode).
> 
> It would be interesting to know more about what was causing that large
> an oscillation -- presumably you will have it reappear once one core
> becomes 100% utilized.  Also, any thoughts on how the change will affect
> performance on an older low-power single-core gianfar system (e.g. 83xx)?
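
For reference, the split described above comes down to registering two
NAPI contexts per interrupt group, one per interrupt source, so that Rx
and Tx confirmation work can be scheduled independently (and land on
different cores). A minimal sketch -- the gfar_*-style helper names are
illustrative, not taken from the actual patches:

static int gfar_poll_rx(struct napi_struct *napi, int budget)
{
	struct gfar_priv_grp *grp =
		container_of(napi, struct gfar_priv_grp, napi_rx);
	int work = gfar_clean_rx_ring(grp->rx_queue, budget);

	if (work < budget) {
		napi_complete(napi);
		gfar_enable_rx_irqs(grp);  /* re-arm frame-receive irqs only */
	}
	return work;
}

static int gfar_poll_tx(struct napi_struct *napi, int budget)
{
	struct gfar_priv_grp *grp =
		container_of(napi, struct gfar_priv_grp, napi_tx);

	gfar_clean_tx_ring(grp->tx_queue);  /* Tx confirmation processing */
	napi_complete(napi);
	gfar_enable_tx_irqs(grp);  /* re-arm Tx confirmation irqs only */
	return 0;
}

/* at init time: two NAPI instances instead of one */
netif_napi_add(dev, &grp->napi_rx, gfar_poll_rx, GFAR_DEV_WEIGHT);
netif_napi_add(dev, &grp->napi_tx, gfar_poll_tx, 2);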

I was also wondering if this low performance could be caused by BQL.

Since the TCP stack is driven by incoming ACKs, a NAPI run could have to
handle 10 TCP ACKs in a row, and the resulting xmits could hit the BQL
limit and have to transit through the qdisc (because the NAPI handler
won't process TX completions in the middle of the RX handler).
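
For context, BQL is the byte-queue-limits accounting a driver does
around its Tx ring; a sketch of the two calls involved (txq,
pkts_completed and bytes_completed are placeholder names):

	/* in ndo_start_xmit(): account bytes handed to the hardware;
	 * BQL stops the queue once in-flight bytes exceed its limit */
	netdev_tx_sent_queue(txq, skb->len);

	/* in the Tx confirmation path: only this call shrinks the
	 * in-flight count and can wake the queue again */
	netdev_tx_completed_queue(txq, pkts_completed, bytes_completed);

If a long RX run delays the confirmation path, the completed-queue side
never runs, the queue stays stopped, and new xmits back up on the qdisc.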

So experiments would be nice, maybe reducing
/proc/sys/net/ipv4/tcp_limit_output_bytes a bit
(from the default 131072 to 65536 or 32768).
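
A throwaway helper for that experiment, if scripting it is more
convenient than an interactive echo into the proc file (run as root;
equivalent to writing the value by hand):

#include <stdio.h>

int main(void)
{
	/* Default is 131072; try 65536 first, then 32768. */
	FILE *f = fopen("/proc/sys/net/ipv4/tcp_limit_output_bytes", "w");

	if (!f) {
		perror("tcp_limit_output_bytes");
		return 1;
	}
	fprintf(f, "65536\n");
	return fclose(f) != 0;
}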



