Date:	Fri, 7 Mar 2014 12:29:00 +0000
From:	David Laight <David.Laight@...LAB.COM>
To:	David Laight <David.Laight@...LAB.COM>,
	'Neal Cardwell' <ncardwell@...gle.com>,
	Rick Jones <rick.jones2@...com>
CC:	"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: Can I limit the number of active tx per TCP socket?

From: David Laight
> From: Neal Cardwell
> > Eric's recent "auto corking" feature may be helpful in this context:
> >
> >   http://lwn.net/Articles/576263/
> 
> Yes, I was running some tests to make sure this didn't cause us
> any grief.
> 
> In fact very aggressive "auto corking" would help my traffic flow.
> I haven't yet tried locally reverting the patch that stopped
> auto corking being quite as effective.
> (I might even try setting the limit lower than 2*skb_true_size.)

Aggressive auto corking helps (it reduces the cpu load of the ppc from
60% to 30% for the same traffic flow) - but only if I reduce the
ethernet speed to 10M.
(Actually I only have the process cpu load for the ppc, which probably
excludes a lot of the tcp rx code.)
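
For reference, auto corking can be switched on and off at run time via
the net.ipv4.tcp_autocorking sysctl (if I remember the knob name right).
A minimal user-space sketch that reads the current setting - illustrative
only, not part of my test code:

	#include <stdio.h>

	/* Print whether TCP auto corking is currently enabled. */
	int main(void)
	{
		FILE *f = fopen("/proc/sys/net/ipv4/tcp_autocorking", "r");
		int val;

		if (!f) {
			perror("tcp_autocorking");
			return 1;
		}
		if (fscanf(f, "%d", &val) != 1) {
			fclose(f);
			return 1;
		}
		printf("tcp_autocorking: %s\n", val ? "on" : "off");
		fclose(f);
		return 0;
	}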

I guess that the transmit is completing before the other active
processes/kernel threads manage to request the next transmit.
This won't be helped by the first packet being short.

Spinning all but one of the cpus with 'while :; do :; done'
has an interesting effect on the workload/throughput.
The aggregate message rate goes from 5200/sec to 9000/sec (now
limited by the 64k links).
I think this happens because the scheduler 'resists' pre-empting
running processes - so the TCP send processing happens in bursts.
The ppc process is then using about 85% cpu (from top).

I'll probably look at delaying the sends within our own code.
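
(One user-space way to get a similar batching effect - just a sketch,
and not necessarily what we'll end up doing - is to cork the socket
around each burst of small writes so the stack coalesces them itself.
send_burst() below is only an illustrative helper, and 'fd' is assumed
to be a connected TCP socket:

	#include <sys/socket.h>
	#include <sys/uio.h>
	#include <netinet/in.h>
	#include <netinet/tcp.h>

	/* Queue a burst of small messages, then flush them together. */
	ssize_t send_burst(int fd, const struct iovec *iov, int n)
	{
		int on = 1, off = 0, i;
		ssize_t total = 0;

		/* Hold partial segments while the burst is queued. */
		setsockopt(fd, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
		for (i = 0; i < n; i++) {
			ssize_t rc = send(fd, iov[i].iov_base,
					  iov[i].iov_len, 0);
			if (rc < 0)
				break;
			total += rc;
		}
		/* Uncorking flushes anything still held by the stack. */
		setsockopt(fd, IPPROTO_TCP, TCP_CORK, &off, sizeof(off));
		return total;
	}

Corking trades a little latency for fewer, larger transmits.)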

	David

