Message-ID: <49F9821C.5010802@cosmosbay.com>
Date: Thu, 30 Apr 2009 12:49:00 +0200
From: Eric Dumazet <dada1@...mosbay.com>
To: Ingo Molnar <mingo@...e.hu>
CC: Christoph Lameter <cl@...ux.com>,
linux kernel <linux-kernel@...r.kernel.org>,
Andi Kleen <andi@...stfloor.org>,
David Miller <davem@...emloft.net>, jesse.brandeburg@...el.com,
netdev@...r.kernel.org, haoki@...hat.com, mchan@...adcom.com,
davidel@...ilserver.org
Subject: Re: [PATCH] poll: Avoid extra wakeups in select/poll
Ingo Molnar wrote:
> * Eric Dumazet <dada1@...mosbay.com> wrote:
>
>> On udpping, I had prior to the patch about 49000 wakeups per
>> second, and after patch about 26000 wakeups per second (matches
>> number of incoming udp messages per second)
>
> very nice. It might not show up as a real performance difference if
> the CPUs are not fully saturated during the test - but it could show
> up as a decrease in CPU utilization.
>
> Also, if you run the test via 'perf stat -a ./test.sh' you should
> see a reduction in instructions executed:
>
> aldebaran:~/linux/linux> perf stat -a sleep 1
>
> Performance counter stats for 'sleep':
>
> 16128.045994 task clock ticks (msecs)
> 12876 context switches (events)
> 219 CPU migrations (events)
> 186144 pagefaults (events)
> 20911802763 CPU cycles (events)
> 19309416815 instructions (events)
> 199608554 cache references (events)
> 19990754 cache misses (events)
>
> Wall-clock time elapsed: 1008.882282 msecs
>
> With -a it's measured system-wide, from start of test to end of test
> - the results will be a lot more stable (and relevant) statistically
> than wall-clock time or CPU usage measurements. (both of which are
> rather imprecise in general)
I tried this perf stuff and got strange results on a CPU-burning benchmark,
saturating my 8 CPUs with a "while (1) ;" loop:
# perf stat -a sleep 10
Performance counter stats for 'sleep':
80334.709038 task clock ticks (msecs)
80638 context switches (events)
4 CPU migrations (events)
468 pagefaults (events)
160694681969 CPU cycles (events)
160127154810 instructions (events)
686393 cache references (events)
230117 cache misses (events)
Wall-clock time elapsed: 10041.531644 msecs
So it's about 16069468196 cycles per second for 8 CPUs.
Divide by 8 to get 2008683524 cycles per second per CPU,
which is not 3000000000 (E5450 @ 3.00GHz).
It seems strange that a "jmp myself" loop is counted as one unhalted cycle
per instruction plus 0.5 halted cycle ...
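For reference, the arithmetic above can be rechecked with a small sketch like the one below. It only uses the figures from the "perf stat -a sleep 10" output; the variable names are mine, and the 10-second interval is the nominal sleep duration (an assumption, since the measured wall-clock time was slightly longer).

```python
# Figures copied from the "perf stat -a sleep 10" output above.
cycles = 160_694_681_969
instructions = 160_127_154_810

# Assumptions: 8 CPUs, a nominal 10 s interval, and a 3.00 GHz nominal clock.
ncpus = 8
interval_s = 10
nominal_hz = 3_000_000_000

# Counted cycles per second, system-wide and per CPU (integer division).
cycles_per_sec = cycles // interval_s
per_cpu_hz = cycles_per_sec // ncpus

# Cycles actually counted per instruction.
counted_cpi = cycles / instructions

# Cycles per instruction we would expect if every CPU had been counted
# at the nominal clock for the whole interval.
expected_cpi = nominal_hz * ncpus * interval_s / instructions

# The difference is the "missing" part per instruction.
missing_cpi = expected_cpi - counted_cpi

print(cycles_per_sec)
print(per_cpu_hz)
print(round(counted_cpi, 3), round(missing_cpi, 3))
```

The counted part comes out close to 1 cycle per instruction and the uncounted part close to 0.5, which is the discrepancy described above.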
Also, after using "perf stat", tbench results are 1778 MB/s
instead of 2610 MB/s, even when no perf stat is running.