Message-ID: <1258099655.4039.998.camel@laptop>
Date:	Fri, 13 Nov 2009 09:07:35 +0100
From:	Peter Zijlstra <peterz@...radead.org>
To:	Lucas De Marchi <lucas.de.marchi@...il.com>
Cc:	Ingo Molnar <mingo@...e.hu>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	linux-kernel@...r.kernel.org
Subject: Re: perf stat output

On Thu, 2009-11-12 at 20:03 -0200, Lucas De Marchi wrote:
> Hi all!
> 
> Some questions about perf stat output. See example:
> 
> 
> lucas@...-linux:~/programming/testprograms> perf stat -e
> L1-dcache-loads -e L1-dcache-load-misses -- make -j
> gcc test_schedchanges.c -o test_schedchanges
> gcc -pthread test_taskaff1.c -o test_taskaff1
> gcc -pthread test_taskaff2.c -o test_taskaff2
> gcc -pthread test_taskaff3.c -o test_taskaff3
> 
>  Performance counter stats for 'make -j':
> 
>        161384667  L1-dcache-loads          #      0.000 M/sec
>         24853791  L1-dcache-load-misses    #      0.000 M/sec
> 
> 	          0.066893389  seconds time elapsed
> 
> Why do we have both L1-dcache-loads and L1-dcache-load-misses with
> 0.000 M/sec? Also, why do we get 0 M/sec when running "perf stat -a -e
> cache-misses -e cache-references" but values different from 0 when
> running "perf stat -a" without selecting the events?

No idea, you'd have to look at the code computing this M/sec stuff. I
think Ingo wrote that, so he might have an idea.
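
FWIW, the obvious back-of-the-envelope rate is just count over elapsed
time; something like this (hypothetical helper, not the actual
builtin-stat.c code):

#include <stdio.h>

/* Hypothetical illustration only, not what perf stat does.
 * M/sec here is simply events / (elapsed seconds * 1e6). */
static double rate_msec(double count, double elapsed_secs)
{
	return count / (elapsed_secs * 1e6);
}

int main(void)
{
	/* numbers from the perf stat run quoted above */
	printf("%10.3f M/sec\n", rate_msec(161384667.0, 0.066893389));
	/* prints roughly 2412.6 M/sec (rough, since make -j uses
	 * several CPUs) -- but clearly not 0.000 */
	return 0;
}

So whatever perf stat is dividing by there, a plain count/elapsed-time
rate would not come out as 0.000.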

> The last question: what does the "scaled from X%" mean? Is it related
> to the maximum number of events a processor's performance counters can
> track at a time?

Yes, if the hardware has only 2 counters and you specify 4, we'll
round-robin those 4 onto the 2. In that case you'll see things like
scaled from ~50% because each counter will only have been on the actual
PMU for about 50% of the time.

(Round-robin rotation happens at tick granularity, so if your runtime is
of that order or shorter you can get funny results, with some counters
ending up at 0.)
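
Roughly, the scaling is done from the time_enabled/time_running pair the
kernel exports for each counter. A sketch of the arithmetic (the struct
and helper names are mine, assuming the counter was opened with
PERF_FORMAT_TOTAL_TIME_ENABLED and PERF_FORMAT_TOTAL_TIME_RUNNING set in
attr.read_format):

#include <stdint.h>

struct counter_read {
	uint64_t value;		/* count accumulated while on the PMU */
	uint64_t time_enabled;	/* ns the event was scheduled in */
	uint64_t time_running;	/* ns it was actually on the PMU */
};

/* Extrapolate a multiplexed count; *pct is the "scaled from X%" figure. */
static uint64_t scale_count(const struct counter_read *c, double *pct)
{
	if (!c->time_running) {
		*pct = 0.0;	/* never got on the PMU at all */
		return 0;
	}
	*pct = 100.0 * c->time_running / c->time_enabled;
	return (uint64_t)((double)c->value *
			  c->time_enabled / c->time_running);
}

With two of four events on the PMU at any one time you'd expect
time_running to be about half of time_enabled, hence "scaled from ~50%".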

