netdev - Re: [PATCH] poll: Avoid extra wakeups in select/poll

Date:	Thu, 30 Apr 2009 12:49:00 +0200
From:	Eric Dumazet <dada1@...mosbay.com>
To:	Ingo Molnar <mingo@...e.hu>
CC:	Christoph Lameter <cl@...ux.com>,
	linux kernel <linux-kernel@...r.kernel.org>,
	Andi Kleen <andi@...stfloor.org>,
	David Miller <davem@...emloft.net>, jesse.brandeburg@...el.com,
	netdev@...r.kernel.org, haoki@...hat.com, mchan@...adcom.com,
	davidel@...ilserver.org
Subject: Re: [PATCH] poll: Avoid extra wakeups in select/poll

Ingo Molnar wrote:
> * Eric Dumazet <dada1@...mosbay.com> wrote:
> 
>> On udpping, I had prior to the patch about 49000 wakeups per 
>> second, and after the patch about 26000 wakeups per second (matches 
>> the number of incoming udp messages per second)
> 
> very nice. It might not show up as a real performance difference if 
> the CPUs are not fully saturated during the test - but it could show 
> up as a decrease in CPU utilization.
> 
> Also, if you run the test via 'perf stat -a ./test.sh' you should 
> see a reduction in instructions executed:
> 
> aldebaran:~/linux/linux> perf stat -a sleep 1
> 
>  Performance counter stats for 'sleep':
> 
>    16128.045994  task clock ticks     (msecs)
>           12876  context switches     (events)
>             219  CPU migrations       (events)
>          186144  pagefaults           (events)
>     20911802763  CPU cycles           (events)
>     19309416815  instructions         (events)
>       199608554  cache references     (events)
>        19990754  cache misses         (events)
> 
>  Wall-clock time elapsed:  1008.882282 msecs
> 
> With -a it's measured system-wide, from start of test to end of test 
> - the results will be a lot more stable (and relevant) statistically 
> than wall-clock time or CPU usage measurements. (both of which are 
> rather imprecise in general)

I tried this perf stuff and got strange results on a cpu-burning benchmark
that saturates my 8 cpus with a "while (1) ;" loop:


# perf stat -a sleep 10

 Performance counter stats for 'sleep':

   80334.709038  task clock ticks     (msecs)
          80638  context switches     (events)
              4  CPU migrations       (events)
            468  pagefaults           (events)
   160694681969  CPU cycles           (events)
   160127154810  instructions         (events)
         686393  cache references     (events)
         230117  cache misses         (events)

 Wall-clock time elapsed: 10041.531644 msecs

So it's about 16069468196 cycles per second for 8 cpus.
Divide by 8 to get 2008683524 cycles per second per cpu,
which is not       3000000000  (E5450  @ 3.00GHz)
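
The arithmetic can be rechecked with a short sketch. The counter value is copied from the perf stat output above; the variable names are only illustrative, and the divisor of 10 seconds follows the approximation used here (dividing by the measured wall-clock time of 10041.53 msecs instead would give a slightly lower figure).

```python
# Value copied from the 'perf stat -a sleep 10' output above.
cpu_cycles = 160694681969       # system-wide "CPU cycles" counter

# Assumptions: nominal 10 s run (wall-clock overshoot ignored), 8 cpus,
# and the nominal clock of an E5450 @ 3.00GHz.
seconds = 10
num_cpus = 8
nominal_hz = 3_000_000_000

cycles_per_sec = cpu_cycles // seconds
cycles_per_sec_per_cpu = cycles_per_sec // num_cpus
fraction_of_nominal = cycles_per_sec_per_cpu / nominal_hz

print(cycles_per_sec)                  # 16069468196
print(cycles_per_sec_per_cpu)          # 2008683524
print(round(fraction_of_nominal, 2))   # 0.67
```

So the counter accounts for roughly two thirds of the cycles a 3.00GHz cpu should deliver.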

It seems strange that a "jmp myself" uses one unhalted cycle per instruction 
and 0.5 halted cycle ...
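
A rough sketch of where the "one unhalted plus 0.5 halted" figure comes from. It assumes that every cpu really ticks at the nominal 3.00GHz for the full 10 seconds; the names are illustrative, and the "uncounted" value is just the difference between that assumption and what the counter reported, not something perf measures directly.

```python
# Values copied from the 'perf stat -a sleep 10' output above.
cpu_cycles = 160694681969
instructions = 160127154810

# Assumptions: 8 cpus at a nominal 3.00GHz for a nominal 10 s.
num_cpus = 8
nominal_hz = 3_000_000_000
seconds = 10

# Cycles per instruction as seen by the counter.
counted_cpi = cpu_cycles / instructions

# Cycles per instruction if all nominal cycles had been counted.
expected_cycles = nominal_hz * num_cpus * seconds
expected_cpi = expected_cycles / instructions

# Cycles per instruction that the counter did not report.
uncounted_cpi = expected_cpi - counted_cpi

print(round(counted_cpi, 2))     # 1.0
print(round(expected_cpi, 2))    # 1.5
print(round(uncounted_cpi, 2))   # 0.5
```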

Also, after using "perf stat", tbench results are 1778 MB/s
instead of 2610 MB/s, even when no perf stat is running.


