Date:	Wed, 18 Apr 2012 10:30:12 -0700
From:	Rick Jones <rick.jones2@...com>
To:	Eric Dumazet <eric.dumazet@...il.com>
CC:	netdev <netdev@...r.kernel.org>
Subject: Re: [PATCH net-next] tcp: avoid expensive pskb_expand_head() calls

On 04/18/2012 10:16 AM, Eric Dumazet wrote:
> On Wed, 2012-04-18 at 10:00 -0700, Rick Jones wrote:
>
>> Is the issue completely sent, or transmit completion processed?  I'd
>> think it is time to the latter that matters (and includes the former) yes?
>>
>
> I don't know. Fact is, we process ACKs before the clone skb is freed by TX
> completion.
>
>> Does the ixgbe driver do transmit completions first when it gets a
>> receive interrupt, or is there still the chance that the receipt of the
>> last ACK for the 64KB skb will hit TCP before the driver has done the
>> free?  (Or does that not matter?)
>
> It does transmit completions first, but that doesn't matter, since we
> receive the ACK before the skb could be drained by the NIC and returned
> to the driver for TX completion.

I was thinking more about the race, if any, between the ACK for the last
byte of the 64 KB skb and the transmit completion processing freeing it
in the driver.  But that may be moot.


>>> Performance results on my Q6600 cpu and 82599EB 10-Gigabit card :
>>> About 3% less cpu used for same workload (single netperf TCP_STREAM),
>>> bounded by x4 PCI-e slots (4660 Mbits).
>>
>> Three percent less or three percentage points less?  Including the
>> details of the netperf-reported service demand would make that clear.
>
> netperf results are not precise enough, since my setup is limited by PCI
> bandwidth. Here are the "perf stat" ones

I'm confused.  Netperf's CPU utilization measurements (-c -C), and by
extension its service demand calculation, should be able to show an
overall three percentage point change in CPU utilization, and even a
three percent one.
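
To make the percent versus percentage-point distinction concrete, here is a rough sketch in Python. Everything in it is illustrative: the 25% starting CPU utilization and the 4-CPU count are made-up assumptions, the throughput and run length are borrowed from the 4660 Mbits and "-l 20" quoted in this thread, and the service demand formula (microseconds of CPU per KB moved, with KB taken as 1024 bytes) is my understanding of what netperf reports, not a copy of its code.

```python
def relative_change_pct(before, after):
    """Change expressed as a percentage of the 'before' value."""
    return (after - before) / before * 100.0


def point_change(before, after):
    """Change in percentage points: plain difference of two percentages."""
    return after - before


def service_demand_usec_per_kb(cpu_util_pct, num_cpus, elapsed_s, bytes_moved):
    """CPU microseconds consumed per KB transferred.

    Assumption: utilization is averaged over all CPUs, and KB means 1024 bytes.
    """
    cpu_usec = (cpu_util_pct / 100.0) * num_cpus * elapsed_s * 1e6
    return cpu_usec / (bytes_moved / 1024.0)


# Hypothetical scenario, not measured data.
ELAPSED_S = 20.0                           # matches "-l 20" in the thread
NUM_CPUS = 4                               # assumed CPU count
BYTES_MOVED = 4660e6 / 8 * ELAPSED_S       # 4660 Mbit/s for 20 s
BEFORE_UTIL = 25.0                         # made-up starting CPU utilization, %

if __name__ == "__main__":
    cases = (
        ("three percent less", BEFORE_UTIL * 0.97),
        ("three percentage points less", BEFORE_UTIL - 3.0),
    )
    sd_before = service_demand_usec_per_kb(
        BEFORE_UTIL, NUM_CPUS, ELAPSED_S, BYTES_MOVED)
    for label, after_util in cases:
        sd_after = service_demand_usec_per_kb(
            after_util, NUM_CPUS, ELAPSED_S, BYTES_MOVED)
        print(label)
        print("  CPU util: %.2f%% -> %.2f%%" % (BEFORE_UTIL, after_util))
        print("  points: %+.2f  relative: %+.2f%%" % (
            point_change(BEFORE_UTIL, after_util),
            relative_change_pct(BEFORE_UTIL, after_util)))
        print("  service demand: %.3f -> %.3f usec/KB" % (sd_before, sd_after))
```

With throughput pinned by the PCI-e slot, service demand scales directly with CPU utilization, so under these assumed numbers a three-point drop from 25% is a 12% relative drop in service demand, while a three percent drop moves it by only 3%.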

>
> Maybe someone can run the test on 20Gb/40Gb links, and NUMA machine.
>
> Before patch :
>
> # perf stat -r 5 -d -d -o RES.before taskset 1 netperf -H 192.168.99.1 -l 20

I'm still learning about perf, and the manpage I have for it does not
discuss the -d option.  Is that measuring system-wide, or only in the
context of the netperf process?

rick