Date:	Tue, 15 Sep 2009 19:26:18 +0200
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Christoph Lameter <cl@...ux-foundation.org>
CC:	netdev@...r.kernel.org
Subject: Re: UDP regression with packets rates < 10k per sec

Christoph Lameter wrote:
> On Tue, 15 Sep 2009, Eric Dumazet wrote:
> 
>> 2.6.31 is actually faster than 2.6.22 on the bench you provided.
> 
> Well at high packet rates which were not the topic.
> 
>> Must be specific to the hardware I guess ?
> 
> Huh? Even your loopback numbers did show the regression up to 10k.
> 
>> As text size presumably is bigger in 2.6.31, fetching code
>> in cpu caches to handle 10 packets per second is what we call
>> a cold path anyway.
> 
> Ok so its an accepted regression? This is a significant reason not to use
> newer versions of kernels for latency critical applications that may have
> to send a packet once in a while for notification. The latency is doubled
> (1G) / tripled / quadrupled (IB) vs 2.6.22.
> 
>> If you want to make it a fast path, you want to make sure its code is
>> always hot in cpu caches, and find a way to inject packets into
>> the kernel to make sure the cpu keeps the path hot.
> 
> Oh, gosh.

There seems to be a lot of confusion on this topic, so here is a full recap:

Once I understood that my 2.6.31 kernel had many more features than 2.6.22, and tuned
it to:

- Let the cpus run at full speed (3GHz instead of 2GHz): before tuning, 2.6.31 was
using the "ondemand" governor and my cpus were running at 2GHz, while they were
running at 3GHz on my 2.6.22 config

- Not let the cpus enter C2/C3 wait states (idle=mwait)

- Correctly bind the ethX irq to one cpu (2.6.22 was running the ethX irq on one cpu, while
 on 2.6.31, irqs were distributed to all online cpus)


Then your mcast test gives the same results at 10pps, 100pps, 1000pps and 10000pps.
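
For anyone wanting to reproduce the tuning, here is a rough sketch of the two runtime knobs from the list above (governor and irq affinity). It is not the exact procedure I used: the "performance" governor, the irq number 30 and the cpu numbers are only assumptions for illustration, and the helper names are made up. idle=mwait is a boot parameter, so it is not covered here. By default the script only prints what it would write.

```python
from pathlib import Path


def cpu_mask(cpu):
    """Hex bitmask selecting one cpu, as written to /proc/irq/<n>/smp_affinity.

    Assumes cpu < 32, so a single hex word is enough.
    """
    return format(1 << cpu, "x")


def planned_writes(ncpus, irq, irq_cpu):
    """List the (path, value) pairs for the governor and irq affinity settings."""
    writes = [
        # Assumption: "performance" keeps the cpu at full clock instead of "ondemand".
        (f"/sys/devices/system/cpu/cpu{c}/cpufreq/scaling_governor", "performance")
        for c in range(ncpus)
    ]
    # Pin the (hypothetical) NIC irq to a single cpu.
    writes.append((f"/proc/irq/{irq}/smp_affinity", cpu_mask(irq_cpu)))
    return writes


def apply(writes, dry_run=True):
    """Print the writes, or perform them when dry_run is False (needs root)."""
    for path, value in writes:
        if dry_run:
            print(f"echo {value} > {path}")
        else:
            Path(path).write_text(value + "\n")


if __name__ == "__main__":
    # Hypothetical machine: 4 cpus, NIC on irq 30, handled by cpu 2.
    apply(planned_writes(ncpus=4, irq=30, irq_cpu=2))
```

Check /proc/interrupts for the real irq number of your ethX before writing anything.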

When sniffing on the receiving side, I see:

- Answer to an icmp ping (served by softirq only): 6 us between request and reply

- Answer to one 'give timestamp' request from the mcast client: 11 us between request and reply,
  regardless of kernel version (2.6.22 or 2.6.31)

So there is a 5 us cost (11 us - 6 us) to actually wake up a process and let it do the recvfrom() and sendto() pair,
which is quite OK, and this time did not change significantly between 2.6.22 and 2.6.31.
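
To make the recvfrom()/sendto() pair concrete, here is a minimal sketch of a 'give timestamp' responder and a client that times one round trip. It is not the mcast test program from this thread: it uses plain unicast UDP on loopback, the function names are made up, and the interpreter overhead means the figure it reports will be far above the 11 us seen on the wire.

```python
import socket
import struct
import threading
import time


def serve_once(sock):
    """Block until one request arrives, then reply with the current time."""
    _, peer = sock.recvfrom(64)                         # process is woken up here
    sock.sendto(struct.pack("!d", time.time()), peer)   # and answers right away


def measure_round_trip():
    """Send one request over loopback; return (server timestamp, elapsed us)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))                       # kernel picks a free port
    worker = threading.Thread(target=serve_once, args=(server,))
    worker.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(2.0)
    start = time.perf_counter()
    client.sendto(b"ts?", server.getsockname())
    payload, _ = client.recvfrom(64)
    elapsed_us = (time.perf_counter() - start) * 1e6

    worker.join()
    client.close()
    server.close()
    (stamp,) = struct.unpack("!d", payload)
    return stamp, elapsed_us


if __name__ == "__main__":
    stamp, elapsed_us = measure_round_trip()
    print(f"server time {stamp:.6f}, round trip {elapsed_us:.1f} us")
```

Timing it from the client like this includes both ends of the loopback path, so sniffing on the receiving side remains the better way to isolate the wakeup cost.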

Hope this helps