lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Fri, 5 Dec 2008 07:31:31 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Paul Mackerras <paulus@...ba.org>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	LKML <linux-kernel@...r.kernel.org>, linux-arch@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>,
	Stephane Eranian <eranian@...glemail.com>,
	Eric Dumazet <dada1@...mosbay.com>,
	Robert Richter <robert.richter@....com>,
	Arjan van de Veen <arjan@...radead.org>,
	Peter Anvin <hpa@...or.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Steven Rostedt <rostedt@...dmis.org>,
	David Miller <davem@...emloft.net>
Subject: Re: [patch 0/3] [Announcement] Performance Counters for Linux


* Paul Mackerras <paulus@...ba.org> wrote:

> Thomas Gleixner writes:
> 
> > We'd like to announce a brand new implementation of performance counter
> > support for Linux. It is a very simple and extensible design that has the
> > potential to implement the full range of features we would expect from such
> > a subsystem.
> 
> Looks like the sort of thing I was thinking about a year or so ago when 
> I was trying to come up with a simpler API than perfmon2. However, it 
> turned out that my design, and I believe yours too, can't do some 
> things that users really want to do with performance counters.
> 
> One thing that this sort of thing can't do is to get values from 
> multiple counters that correlate with each other.  For instance, we 
> would often want to count, say, L2 cache misses and instructions 
> completed at the same time, and be able to read both counters at very 
> close to the same time, so that we can measure average L2 cache misses 
> per instruction completed, which is useful.

This can be done in a very natural way with our abstraction, and the 
"hello.c" example happens to do exactly that:

  aldebaran:~/perf-counter-test> ./hello
  doing perf_counter_open() call:
  counter[0]... fd: 3.
  counter[1]... fd: 4.
  counter[0] delta: 10866 cycles
  counter[1] delta: 414 cycles
  counter[0] delta: 23640 cycles
  counter[1] delta: 3673 cycles
  counter[0] delta: 28225 cycles
  counter[1] delta: 3695 cycles

This counts cycles executed and instructions executed, and reads the two 
counters out at the same time.

I just modified it to measure the exact example you mentioned above - L2 
cache misses and instructions completed, sampled once every second:

  titan:~/perf-counter-test> ./hello
  doing perf_counter_open() call:

  counter[0] delta: 1 cachemisses
  counter[1] delta: 497 instructions

  counter[0] delta: 14 cachemisses
  counter[1] delta: 4303 instructions

  counter[0] delta: 6 cachemisses
  counter[1] delta: 3666 instructions

  counter[0] delta: 2 cachemisses
  counter[1] delta: 3641 instructions

  counter[0] delta: 1 cachemisses
  counter[1] delta: 3641 instructions

It's a matter of:

        fd1 = perf_counter_open(PERF_COUNT_CACHE_MISSES, 0, 0, 0, -1);
        fd2 = perf_counter_open(PERF_COUNT_INSTRUCTIONS, 0, 0, 0, -1);

So it's very much possible. (If i've missed something about your example 
then please let me know.)

> Another problem is that this abstraction provides no way to deal with 
> interrelationships between counters.  For example, on PowerPC it is 
> common to have a facility where one counter overflowing can cause all 
> the other counters to freeze.  I don't see this abstraction providing 
> any way to handle that.

We could add that facility if it makes sense - there's no reason why 
there couldnt be event interaction between counters - we just went for 
the most common event variants in v1.

Btw., i'm curious, why would we want to do that? It skews the results if 
the task continues executing and counters stop. To get the highest 
quality profiling output the counters should follow the true state of the 
task that is profiled - and events should be passed to the monitoring 
task asynchronously. The _events_ can contain precise coupled information 
- but the counters should continue.

What i'd do is what hello.c does: if you want to read out multiple 
counters at once, you can read them out at once.

(Again, please explain in more detail if i have missed something about 
your observation.)

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ