Message-ID: <49763806.5090009@goop.org>
Date: Tue, 20 Jan 2009 12:45:58 -0800
From: Jeremy Fitzhardinge <jeremy@...p.org>
To: Ingo Molnar <mingo@...e.hu>
CC: Nick Piggin <npiggin@...e.de>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>, hpa@...or.com,
jeremy@...source.com, chrisw@...s-sol.org, zach@...are.com,
rusty@...tcorp.com.au, Andrew Morton <akpm@...ux-foundation.org>,
Xen-devel <xen-devel@...ts.xensource.com>
Subject: Re: lmbench lat_mmap slowdown with CONFIG_PARAVIRT
Ingo Molnar wrote:
> * Ingo Molnar <mingo@...e.hu> wrote:
>
>
>>> Times for lmbench are, I believe, in nanoseconds; either way, lower
>>> is better.
>>>
>>> non pv AVG=464.22 STD=5.56
>>> paravirt AVG=502.87 STD=7.36
>>>
>>> Nearly 10% performance drop here, which is quite a bit... hopefully
>>> people are testing the speed of their PV implementations against
>>> non-PV bare metal :)
>>>
>> Ouch, that looks unacceptably expensive. All the major distros turn
>> CONFIG_PARAVIRT on. paravirt_ops was introduced in x86 with the express
>> promise to have no measurable runtime overhead.
>>
>
> Here are some more precise stats done via hardware counters on a
> perfcounters kernel using 'timec', running a modified version of the
> 'mmap performance stress-test' app I made years ago.
>
> The MM benchmark app can be downloaded from:
>
> http://redhat.com/~mingo/misc/mmap-perf.c
>
> timec.c can be picked up from:
>
> http://redhat.com/~mingo/perfcounters/timec.c
>
> mmap-perf conducts 1 million mmap()/munmap()/mremap() calls, and also
> touches the mapped area with a certain probability. The patterns are
> pseudo-random, and the random seed is initialized to the same value so
> repeated runs produce the exact same mmap sequence.
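
[ A minimal sketch of the kind of loop such a stress test runs; this is
  hypothetical and not the actual mmap-perf.c, which can be fetched from
  the URL below:

#define _GNU_SOURCE
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>

#define ITERATIONS 1000000
#define MAX_PAGES  16

int main(void)
{
	long page = sysconf(_SC_PAGESIZE);
	void *map = NULL;
	size_t size = 0;
	int i;

	srand(1);	/* fixed seed: every run replays the same sequence */

	for (i = 0; i < ITERATIONS; i++) {
		size_t new_size = ((rand() % MAX_PAGES) + 1) * page;

		if (!map) {
			map = mmap(NULL, new_size, PROT_READ | PROT_WRITE,
				   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
			if (map == MAP_FAILED)
				return 1;
			size = new_size;
		} else if (rand() % 4 == 0) {
			map = mremap(map, size, new_size, MREMAP_MAYMOVE);
			if (map == MAP_FAILED)
				return 1;
			size = new_size;
		} else {
			munmap(map, size);
			map = NULL;
			continue;
		}

		/* touch the mapping with some probability, forcing faults */
		if (rand() % 2)
			memset(map, 0, size);
	}
	if (map)
		munmap(map, size);
	return 0;
}
  ]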
>
> I ran the test with a single thread and bound to a single core:
>
> # taskset 2 timec -e -5,-4,-3,0,1,2,3 ./mmap-perf 1
>
> [ I ran it as root - so that kernel-space hardware-counter statistics are
> included as well. ]
>
> The results are quite telling about the true cost of paravirt_ops on a
> native kernel (CONFIG_PARAVIRT=y):
>
> -----------------------------------------------
> | Performance counter stats for './mmap-perf' |
> -----------------------------------------------
> | |
> | x86-defconfig | PARAVIRT=y
> |------------------------------------------------------------------
> |
> | 1311.554526 | 1360.624932 task clock ticks (msecs) +3.74%
> | |
> | 1 | 1 CPU migrations
> | 91 | 79 context switches
> | 55945 | 55943 pagefaults
> | ............................................
> | 3781392474 | 3918777174 CPU cycles +3.63%
> | 1957153827 | 2161280486 instructions +10.43%
>
!!
> | 50234816 | 51303520 cache references +2.12%
> | 5428258 | 5583728 cache misses +2.86%
>
Is this I-cache, D-cache, or combined?
> | |
> | 1314.782469 | 1363.694447 time elapsed (msecs) +3.72%
> | |
> -----------------------------------
>
> The most surprising element is that in the paravirt_ops case we run 204
> million more instructions - out of the ~2000 million instructions total.
>
> That's an increase of over 10%!
>
Yow! That's pretty awful. We knew the static instruction count was up,
but wouldn't have thought it would hit the dynamic instruction count so
much...
I think there are some immediate tweaks we can make to the code
generated for each call site, which will help to an extent.
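
[ Roughly, the extra instructions come from the fact that with
  CONFIG_PARAVIRT=y each low-level operation is an indirect call through
  an ops structure rather than a single inline instruction. A simplified
  sketch of the pattern (not the real pv_ops macros or structure names):

/* Natively, disabling interrupts is one inline instruction: */
static inline void native_irq_disable(void)
{
	asm volatile("cli" : : : "memory");
}

/*
 * Under paravirt the same operation is an indirect call through an ops
 * structure, so every call site pays for the call/ret pair and the
 * register clobbers around it, even when the pointer still ends up at
 * the native implementation:
 */
struct pv_irq_ops_sketch {
	void (*irq_disable)(void);
};

static struct pv_irq_ops_sketch irq_ops_sketch = {
	.irq_disable = native_irq_disable,
};

static inline void paravirt_irq_disable(void)
{
	irq_ops_sketch.irq_disable();
}

  At boot the kernel patches many such sites into direct calls or back
  into the native inline instruction, so what remains per call site is
  mostly the argument setup and register save/restore around the call;
  trimming that is presumably the sort of per-call-site tweak meant
  above. ]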
J