linux-kernel - Re: [PATCH 1/1] perf tools: Add missing user space support for config1/config2


Message-ID: <BANLkTi=G7-v3ysxK2wY_3f8TecbD6ZjKog@mail.gmail.com>
Date:	Fri, 22 Apr 2011 11:41:03 +0200
From:	Stephane Eranian <eranian@...gle.com>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Arnaldo Carvalho de Melo <acme@...radead.org>,
	linux-kernel@...r.kernel.org, Andi Kleen <ak@...ux.intel.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Lin Ming <ming.m.lin@...el.com>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>, eranian@...il.com,
	Arun Sharma <asharma@...com>
Subject: Re: [PATCH 1/1] perf tools: Add missing user space support for config1/config2

On Fri, Apr 22, 2011 at 11:23 AM, Ingo Molnar <mingo@...e.hu> wrote:
>
> * Stephane Eranian <eranian@...gle.com> wrote:
>
>> On Fri, Apr 22, 2011 at 10:06 AM, Ingo Molnar <mingo@...e.hu> wrote:
>> >
>> > * Ingo Molnar <mingo@...e.hu> wrote:
>> >
>> >> This needs to be a *lot* more user friendly. Users do not want to type in
>> >> stupid hexa magic numbers to get profiling. We have moved beyond the oprofile
>> >> era really.
>> >>
>> >> Unless there's proper generalized and human usable support i'm leaning
>> >> towards turning off the offcore user-space accessible raw bits for now, and
>> >> use them only kernel-internally, for the cache events.
>>
>> Generic cache events are a myth. They are not usable. I keep getting
>> questions from users because nobody knows what they are actually counting,
>> thus nobody knows how to interpret the counts. You cannot really hide the
>> micro-architecture if you want to make any sensible measurements.
>
> Well:
>
>  aldebaran:~> perf stat --repeat 10 -e instructions -e L1-dcache-loads -e L1-dcache-load-misses -e LLC-misses ./hackbench 10
>  Time: 0.125
>  Time: 0.136
>  Time: 0.180
>  Time: 0.103
>  Time: 0.097
>  Time: 0.125
>  Time: 0.104
>  Time: 0.125
>  Time: 0.114
>  Time: 0.158
>
>  Performance counter stats for './hackbench 10' (10 runs):
>
>     2,102,556,398 instructions             #      0.000 IPC     ( +-   1.179% )
>       843,957,634 L1-dcache-loads            ( +-   1.295% )
>       130,007,361 L1-dcache-load-misses      ( +-   3.281% )
>         6,328,938 LLC-misses                 ( +-   3.969% )
>
>        0.146160287  seconds time elapsed   ( +-   5.851% )
>
> It's certainly useful if you want to get ballpark figures about cache behavior
> of an app and want to do comparisons.
>
What can you conclude from the above counts?
Are they good or bad? If they are bad, how do you go about fixing the app?
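
To illustrate how little the raw counts say on their own, here is a minimal
Python sketch that derives the usual ratios from the numbers quoted above. The
helper names are made up for this example, and no threshold for "good" or
"bad" is implied:

```python
# Counts copied from the quoted perf stat output (means over 10 runs).
counts = {
    "instructions": 2_102_556_398,
    "L1-dcache-loads": 843_957_634,
    "L1-dcache-load-misses": 130_007_361,
    "LLC-misses": 6_328_938,
}


def miss_ratio(misses, accesses):
    """Fraction of accesses that missed."""
    return misses / accesses


def per_kilo_instruction(events, instructions):
    """Events per 1000 retired instructions."""
    return 1000.0 * events / instructions


l1d_miss_ratio = miss_ratio(counts["L1-dcache-load-misses"],
                            counts["L1-dcache-loads"])
llc_mpki = per_kilo_instruction(counts["LLC-misses"], counts["instructions"])

print(f"L1D load miss ratio: {l1d_miss_ratio:.1%}")
print(f"LLC misses per 1000 instructions: {llc_mpki:.2f}")
```

The ratios are easy to compute, but without knowing what each event counts on
a given processor they do not tell you where to look in the app.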

> There are inconsistencies in our generic cache events - but that's not really a
> reason to obscure their usage behind nonsensical microarchitecture-specific
> details.
>
The actual events are a reflection of the micro-architecture. They indirectly
describe how it works. It is not clear to me that you can really improve your
app without some exposure to the micro-architecture.

So if you want to have generic events, I am fine with this, but you should not
block access to actual events pretending they are useless. Some people are
certainly interested in using them and learning about the micro-architecture
of their processor.


> But i'm definitely in favor of making these generalized events more consistent
> across different CPU types. Can you list examples of inconsistencies that we
> should resolve? (and which you possibly consider impossible to resolve, right?)
>
To make generic events more uniform across processors, one would have to have
precise definitions as to what they are supposed to count. Once you have that,
we may have a better chance at finding consistent mappings for each processor.
I have not yet seen such definitions.
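
For context, a generic cache event is identified only by a (cache, operation,
result) triple packed into the config field of a PERF_TYPE_HW_CACHE event. A
minimal Python sketch of that encoding follows. The helper name is
hypothetical, and the constants mirror the kernel enums as I understand them:

```python
# Values assumed to mirror the perf_event header enums.
PERF_TYPE_HW_CACHE = 3

CACHE_ID = {"L1D": 0, "L1I": 1, "LL": 2}
CACHE_OP = {"read": 0, "write": 1, "prefetch": 2}
CACHE_RESULT = {"access": 0, "miss": 1}


def hw_cache_config(cache, op, result):
    """Pack a (cache, op, result) triple: id | (op << 8) | (result << 16)."""
    return CACHE_ID[cache] | (CACHE_OP[op] << 8) | (CACHE_RESULT[result] << 16)


l1d_load_misses = hw_cache_config("L1D", "read", "miss")
llc_load_misses = hw_cache_config("LL", "read", "miss")

print(hex(l1d_load_misses))
print(hex(llc_load_misses))
```

The encoding names an event. It does not say what the hardware event behind it
is required to count, which is the definition I am asking for.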