Message-ID: <20091202094730.GC22654@elte.hu>
Date: Wed, 2 Dec 2009 10:47:30 +0100
From: Ingo Molnar <mingo@...e.hu>
To: "Ma, Ling" <ling.ma@...el.com>
Cc: Arjan van de Ven <arjan@...radead.org>,
Dave Jones <davej@...hat.com>, "hpa@...or.com" <hpa@...or.com>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFC] [X86] Compile Option Os versus O2 on latest x86 platform
* Ma, Ling <ling.ma@...el.com> wrote:
> Hi Ingo
>
> Thanks for your correction. We used perf stat --repeat 3 to test
> volano, tbench, and kbuild. Because netperf has multiple items, we may
> send those results out later.
>
> volano_Os:
>    18680627716893  cycles            #   2925.196 M/sec   ( +-   0.339% )
>     7247421283541  instructions      #      0.388 IPC     ( +-   0.124% )
>      226838591574  cache-references  #     35.521 M/sec   ( +-   0.971% )
>        9420427393  cache-misses      #      1.475 M/sec   ( +-   0.897% )
>
> volano_O2:
>    17145170491943  cycles            #   2918.985 M/sec   ( +-   0.288% )
>     7324126478801  instructions      #      0.427 IPC     ( +-   0.090% )
>      219064318074  cache-references  #     37.296 M/sec   ( +-   0.792% )
>        9491237013  cache-misses      #      1.616 M/sec   ( +-   0.439% )
> O2 is better than Os for volano
> O2 is no different from Os for tbench
> O2 is no different from Os for kbuild

Ok, this looks pretty credible, thanks for going through it.

For Volano, the difference in cycles is about 9.0% (Os relative to O2),
well above the 0.3% noise level, so it's significant.
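
As a rough check of that figure, here is a minimal sketch in Python that recomputes the cycle overhead from the quoted perf stat numbers. The variable names are made up for illustration, and summing the two relative deviations into one noise bound is an assumption of this sketch, not something perf reports:

```python
# Cycle counts and run-to-run deviations copied from the quoted perf stat output.
cycles_os = 18_680_627_716_893   # volano, kernel built with -Os
cycles_o2 = 17_145_170_491_943   # volano, kernel built with -O2
noise_os_pct = 0.339             # "+-" deviation reported for the Os cycles
noise_o2_pct = 0.288             # "+-" deviation reported for the O2 cycles

# Extra cycles spent by the Os kernel, relative to the O2 kernel.
overhead_pct = (cycles_os - cycles_o2) / cycles_o2 * 100.0

# Conservative bound (assumption): add the two relative deviations.
noise_bound_pct = noise_os_pct + noise_o2_pct

print(f"Os cycle overhead vs O2: {overhead_pct:.2f}%")
print(f"combined noise bound:    {noise_bound_pct:.2f}%")
print("significant" if overhead_pct > noise_bound_pct else "within noise")
```

Even with the pessimistic bound, the overhead is more than an order of magnitude above the measurement noise.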

Would it be possible to do a 'perf record' and 'perf report' comparison
between two volano runs, to see where the nearly 10% overhead comes
from? It might be one or two functions mis-optimized by GCC, or it
could be an across-the-spectrum slowdown.

Note that the number of instructions differs by only about 1% (the Os
kernel actually executes slightly fewer of them), while the cycle count
is about 9% higher. So we might be hitting some nasty corner case - or it
might be some caching effect. (which does not seem to be supported by
the numbers though - the LLC cache-misses count does not look
significantly higher in the Os case)
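
The same quoted counters can be compared side by side. The following Python sketch is only an illustration of that arithmetic; the dictionary layout and the helper name `pct_delta` are hypothetical, and the values are copied from the perf stat output above:

```python
# Counters copied from the quoted perf stat output (volano runs).
os_run = {
    "cycles": 18_680_627_716_893,
    "instructions": 7_247_421_283_541,
    "cache-misses": 9_420_427_393,
}
o2_run = {
    "cycles": 17_145_170_491_943,
    "instructions": 7_324_126_478_801,
    "cache-misses": 9_491_237_013,
}

def pct_delta(event):
    """Percent by which the Os run differs from the O2 run for one event."""
    return (os_run[event] - o2_run[event]) / o2_run[event] * 100.0

# Instructions per cycle, as perf stat prints in the "IPC" column.
ipc_os = os_run["instructions"] / os_run["cycles"]
ipc_o2 = o2_run["instructions"] / o2_run["cycles"]

for event in os_run:
    print(f"{event:>14}: Os vs O2 {pct_delta(event):+.2f}%")
print(f"IPC: Os {ipc_os:.3f}, O2 {ipc_o2:.3f}")
```

A negative delta means the Os run recorded fewer events than the O2 run. The cycle delta is the only one that moves by several percent, which is why the lower IPC of the Os kernel, rather than instruction count or cache misses, is the thing to explain.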

'perf annotate fn_name' will also help you see where the overhead
hot-spots are. If you build the vmlinux with CONFIG_DEBUG_INFO enabled,
the perf annotate output will interleave assembly and source code.
(otherwise it will be assembly output only)

You probably want to use the latest version of 'perf' for all that
analysis, from:

   http://people.redhat.com/mingo/tip.git/README

Thanks,

	Ingo