Message-ID: <49EF85E3.8060703@cosmosbay.com>
Date: Wed, 22 Apr 2009 23:02:27 +0200
From: Eric Dumazet <dada1@...mosbay.com>
To: Christoph Lameter <cl@...ux.com>
CC: David Miller <davem@...emloft.net>,
Michael Chan <mchan@...adcom.com>,
Ben Hutchings <bhutchings@...arflare.com>,
netdev@...r.kernel.org
Subject: Re: udp ping pong measurements from 2.6.22 to .30 with various cpu
affinities
Christoph Lameter wrote:
> Here are the results of udp ping pong tests. Tests were done between
> two machines. The first box was running a 2.6.22 kernel with the nic IRQ
> and udpping pinned to processor 4.
>
> The second box ran the various kernel versions. NIC irq pinned to cpu 4.
> Then the pinning of udpping (see gentwo.org/ll) was varied:
>
> 1. Pinned to the same processor (cpu4)
> 2. Pinned to a processor that shares the L2 cache (cpu5)
> 3. Pinned to a processor not sharing L2 (cpu6)

Here on my dev machine, cpu0, cpu2, cpu4 and cpu6 are on physical CPU 0,
and cpu1, cpu3, cpu5 and cpu7 are on physical CPU 1:

egrep "processor|core id|physical" /proc/cpuinfo
processor : 0
physical id : 0
core id : 0
processor : 1
physical id : 1
core id : 0
processor : 2
physical id : 0
core id : 2
processor : 3
physical id : 1
core id : 2
processor : 4
physical id : 0
core id : 1
processor : 5
physical id : 1
core id : 1
processor : 6
physical id : 0
core id : 3
processor : 7
physical id : 1
core id : 3

Check /proc/cpuinfo, and check that the mapping doesn't change between
kernel versions.
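
The egrep output above can also be reduced to a per-package CPU list
mechanically. What follows is only a minimal sketch. It assumes the
x86-style "processor", "physical id" and "core id" fields shown above
are present (other architectures may not report them). The names
parse_cpuinfo, cpus_by_package and SAMPLE are illustrative, and SAMPLE
is just the first four entries of the listing above.

```python
from collections import defaultdict

# First four entries of the listing above, used as sample input.
SAMPLE = """\
processor : 0
physical id : 0
core id : 0
processor : 1
physical id : 1
core id : 0
processor : 2
physical id : 0
core id : 2
processor : 3
physical id : 1
core id : 2
"""


def parse_cpuinfo(text):
    """Return {logical cpu: (physical id, core id)} from /proc/cpuinfo text."""
    topology = {}
    current = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without "key : value"
        key, value = key.strip(), value.strip()
        if key == "processor":
            # A "processor" line starts a new logical CPU entry.
            current = {"processor": int(value)}
        elif key in ("physical id", "core id") and current:
            current[key] = int(value)
            if "physical id" in current and "core id" in current:
                topology[current["processor"]] = (
                    current["physical id"],
                    current["core id"],
                )
    return topology


def cpus_by_package(topology):
    """Group logical CPU numbers by physical package id."""
    packages = defaultdict(list)
    for cpu, (package, _core) in sorted(topology.items()):
        packages[package].append(cpu)
    return dict(packages)
```

Feeding it the contents of /proc/cpuinfo on each kernel under test and
comparing the results is one way to spot a change in numbering. Note
that sharing a package does not by itself say which CPUs share an L2.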
>
> Results follow (a nice diagram is available from
> http://gentwo.org/results/udpping-tests-2.pdf)

Nice graphs, but the test conditions are not documented.

>
> Observations:
> - Pinning to the same cpu yields almost 8 usecs vs. another cpu sharing
> the same L2.
> - Strangely the cpu not sharing the l2 is better than a cpu with the same
> L2.

When I see strange results like that, I ask myself:
is the problem in the system being observed, or in the observer?

> - Regression with cpu on the same cpu as the interrupt is about 1.5 usecs
> - Improvement with cpu on the same l2 cache is improved.
> - Regression of 1 usec if cpu is not sharing l2.
>
> Hmmm... This could all be a scheduling problem. If the processes are not
> placed where the IRQ occurs then there will be a significant disadvantage.
>

We already pointed out that it was probably scheduling, since ICMP pings
don't use processes and show no regression here. Patching the kernel to
implement udpping at softirq level should be quite easy if you really
want to check the UDP stack.

Recent network improvements focused on scalability more than latency
(multiple flows, not a single flow!).
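
For anyone who wants to see the scheduling effect from userspace first,
a process-level UDP ping-pong can be sketched as below. This is not the
udpping tool from gentwo.org/ll, and it is not the softirq-level variant
suggested above. It runs over loopback inside one process, so it never
touches the NIC or its IRQ. The names echo_server and ping_pong, the
40-byte payload and the 5 second timeouts are assumptions made for
illustration. The optional pinning uses os.sched_setaffinity, which is
only available on Linux.

```python
import os
import socket
import threading
import time


def echo_server(sock, count):
    """Echo `count` datagrams back to their sender, then return."""
    for _ in range(count):
        data, addr = sock.recvfrom(2048)
        sock.sendto(data, addr)


def ping_pong(count=1000, payload=b"x" * 40, cpu=None):
    """Return `count` loopback round-trip times in microseconds."""
    if cpu is not None and hasattr(os, "sched_setaffinity"):
        # Pin the caller to one CPU; the echo thread started below
        # inherits the same mask. Raises OSError if `cpu` is not usable.
        os.sched_setaffinity(0, {cpu})

    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    server.settimeout(5.0)
    thread = threading.Thread(
        target=echo_server, args=(server, count), daemon=True
    )
    thread.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(5.0)
    client.connect(server.getsockname())

    rtts = []
    try:
        for _ in range(count):
            start = time.perf_counter_ns()
            client.send(payload)
            client.recv(2048)  # block until the echo comes back
            rtts.append((time.perf_counter_ns() - start) / 1000.0)
    finally:
        client.close()
        thread.join(timeout=5.0)
        server.close()
    return rtts
```

Calling ping_pong(cpu=4) and comparing min(rtts) or the median against
an unpinned run gives a rough feel for how much process placement alone
moves the numbers, before blaming the UDP stack.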