netdev - Re: UDP regression with packets rates


Date:	Tue, 15 Sep 2009 21:02:15 +0200
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Christoph Lameter <cl@...ux-foundation.org>
CC:	netdev@...r.kernel.org
Subject: Re: UDP regression with packets rates < 10k per sec

Christoph Lameter wrote:
> On Tue, 15 Sep 2009, Eric Dumazet wrote:
> 
>> Once I understood that my 2.6.31 kernel had many more features than 2.6.22, I tuned
>> it to:
>>
>> - Let the cpus run at full speed (3GHz instead of 2GHz): before tuning, 2.6.31 was
>> using the "ondemand" governor and my cpus were running at 2GHz, while they were
>> running at 3GHz on my 2.6.22 config
> 
> My kernel did not have support for any governors compiled in.
> 
>> - Don't let the cpus enter C2/C3 wait states (idle=mwait)
> 
> Ok. Trying idle=mwait.
> 
>> - Correctly affine cpu to ethX irq (2.6.22 was running ethX irq on one cpu, while
>>  on 2.6.31, irqs were distributed to all online cpus)
> 
> Interrupts of both 2.6.22 and 2.6.31 go to cpu 0. Does it matter for
> loopback?

No, of course not: loopback triggers the softirq on the local cpu, so there is
no special setup to respect.
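
For a real ethX device, where the affinity does matter, the cpus an irq may be
delivered to can be read from the hex bitmask in /proc/irq/<n>/smp_affinity.
A minimal sketch of decoding such a mask follows; the helper name is only
illustrative and is not something used in this thread.

```python
def cpus_from_affinity_mask(mask: str) -> list[int]:
    """Return the cpu numbers whose bits are set in a hex affinity mask."""
    # Masks wider than 32 bits are written as comma-separated 32-bit groups,
    # most significant group first, so dropping the commas gives one hex number.
    value = int(mask.strip().replace(",", ""), 16)
    return [cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1]


if __name__ == "__main__":
    print(cpus_from_affinity_mask("00000001"))  # irq bound to cpu 0 only
    print(cpus_from_affinity_mask("0000000f"))  # irq spread over cpus 0-3
```

A mask of 00000001 corresponds to the "everything goes to cpu 0" case, while a
mask with several bits set corresponds to irqs being distributed over cpus.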

> 
>> Then, your mcast test gives the same results at 10pps, 100pps, 1000pps, 10000pps
> 
> loopback via mcast -Ln1 -r <rate>
> 
> 		10pps	100pps	1000pps	10000pps
> 2.6.22(32bit)	7.36	7.28	7.15	7.16
> 2.6.31(64bit)	9.28	10.27	9.70	9.79
> 
> What a difference. Now the initial latency rampup for 2.6.31 is gone. So
> even w/o governors the kernel does something to increase the latencies.
> 
> We sacrificed 2 - 3 microseconds per message to kernel features, bloat and
> 64 bitness?

Well, I don't know. I mainly use 32bit kernels, but yes, using 64bit has a cost:
skbs, for example, are bigger and sockets are bigger, so we touch more cache lines
per transaction...
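
The per-rate difference between the two kernels in the table quoted above can
be worked out as follows. This is only arithmetic on the quoted figures, and it
assumes the values are latencies in microseconds, as the "2 - 3 microseconds"
remark suggests.

```python
# Figures copied from the quoted table (loopback via mcast -Ln1 -r <rate>).
rates_pps = [10, 100, 1000, 10000]
latency_2_6_22 = [7.36, 7.28, 7.15, 7.16]   # 32bit kernel
latency_2_6_31 = [9.28, 10.27, 9.70, 9.79]  # 64bit kernel

# Extra latency of 2.6.31 over 2.6.22 at each rate, rounded to 2 decimals.
deltas = [round(new - old, 2) for old, new in zip(latency_2_6_22, latency_2_6_31)]

if __name__ == "__main__":
    for rate, delta in zip(rates_pps, deltas):
        print(f"{rate:>6} pps: +{delta:.2f}")
```

The differences come out between 1.92 and 2.99, which matches the "2 - 3" estimate.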


You could precisely compute the number of cycles per transaction with the "perf"
tools (only on 2.6.31), comparing 64bit and 32bit kernels, by benching at 100000 pps
for example and counting the number of perf counter irqs per second.
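
A rough sketch of that computation, assuming each perf counter irq corresponds
to a fixed number of counted cycles (the sample period). The function names and
the numbers in the example are hypothetical and are not measurements from this
thread.

```python
def cycles_per_packet(irqs_per_sec: float, sample_period: int,
                      packets_per_sec: float) -> float:
    """Estimate cpu cycles spent per packet from the perf counter irq rate."""
    # Assumption: the cycle counter raises one irq every `sample_period` cycles.
    cycles_per_sec = irqs_per_sec * sample_period
    return cycles_per_sec / packets_per_sec


def cycles_to_usec(cycles: float, cpu_hz: float) -> float:
    """Convert a cycle count to microseconds at a given cpu frequency."""
    return cycles / cpu_hz * 1e6


if __name__ == "__main__":
    # Made-up inputs: 3000 irqs/s, one irq per 100000 cycles, 100000 pps.
    cycles = cycles_per_packet(3000, 100_000, 100_000)
    print(cycles, "cycles per packet")
    print(cycles_to_usec(cycles, 3e9), "usec per packet at 3GHz")
```

Running the same bench on both kernels and comparing the two results would give
the per-transaction cost difference in cycles rather than in wall-clock time.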