Date:	Sun, 15 Nov 2009 10:13:43 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Lucas De Marchi <lucas.de.marchi@...il.com>
Cc:	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	linux-kernel@...r.kernel.org
Subject: Re: perf stat output


* Lucas De Marchi <lucas.de.marchi@...il.com> wrote:

> Hi all!
> 
> Some questions about perf stat output. See example:
> 
> 
> lucas@...-linux:~/programming/testprograms> perf stat -e L1-dcache-loads -e L1-dcache-load-misses -- make -j
> gcc test_schedchanges.c -o test_schedchanges
> gcc -pthread test_taskaff1.c -o test_taskaff1
> gcc -pthread test_taskaff2.c -o test_taskaff2
> gcc -pthread test_taskaff3.c -o test_taskaff3
> 
>  Performance counter stats for 'make -j':
> 
>        161384667  L1-dcache-loads          #      0.000 M/sec
>         24853791  L1-dcache-load-misses    #      0.000 M/sec
> 
>      0.066893389  seconds time elapsed
> 
> Why do we have both L1-dcache-loads and L1-dcache-load-misses with 
> 0.000 M/sec? Also, why do we have 0 M/s when running "perf stat -a -e 
> cache-misses -e cache-references" but values different than 0 when 
> running "perf stat -a" without selecting the events?

You need the 'task-clock' event to be able to see M/sec metrics. I.e.:

$ perf stat -e L1-dcache-loads -e L1-dcache-load-misses -e task-clock sleep 1

 Performance counter stats for 'sleep 1':

         201330  L1-dcache-loads          #    566.234 M/sec
          29916  L1-dcache-load-misses    #     84.138 M/sec
       0.355560  task-clock-msecs         #      0.000 CPUs 

    1.000621650  seconds time elapsed

I agree with you that seeing '0.000 M/sec' is both confusing and 
incorrect. One solution would be to skip the printout in that case.

You can find the latest 'perf' code in:

  http://people.redhat.com/mingo/tip.git/README

( the tools/perf/ bits are backwards compatible with any perf kernel you 
  are running right now, so no reboot is needed. )

You can find the stats printing in tools/perf/builtin-stat.c, in the 
abs_printout() function:

        } else {
                total = avg_stats(&runtime_nsecs_stats);

                if (total)
                        ratio = 1000.0 * avg / total;

                fprintf(stderr, " # %10.3f M/sec", ratio);

I think if runtime_nsecs_stats has no samples (i.e. if no 'task-clock' 
events were measured), then we might be able to skip the printout by 
doing something like:

        } else if (runtime_nsecs_stats.n != 0) {

Would you be interested in sending a (tested) patch for that? In theory 
that one-line change alone should suffice - but I have not tested it.

	Ingo