Message-ID: <4A849E14.7030007@linux.vnet.ibm.com>
Date: Thu, 13 Aug 2009 16:13:24 -0700
From: Corey Ashford <cjashfor@...ux.vnet.ibm.com>
To: eranian@...il.com
CC: Ingo Molnar <mingo@...e.hu>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
LKML <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Robert Richter <robert.richter@....com>,
Paul Mackerras <paulus@...ba.org>,
Andi Kleen <andi@...stfloor.org>,
Maynard Johnson <mpjohn@...ibm.com>,
Carl Love <cel@...ibm.com>,
Corey J Ashford <cjashfor@...ibm.com>,
Philip Mucci <mucci@...s.utk.edu>,
Dan Terpstra <terpstra@...s.utk.edu>,
perfmon2-devel <perfmon2-devel@...ts.sourceforge.net>
Subject: Re: perf_counters issue with PERF_SAMPLE_GROUP
stephane eranian wrote:
> On Thu, Aug 13, 2009 at 11:46 AM, Ingo Molnar<mingo@...e.hu> wrote:
>> * stephane eranian <eranian@...glemail.com> wrote:
>>
>>> On Wed, Aug 12, 2009 at 11:02 AM, Ingo Molnar<mingo@...e.hu> wrote:
[snip]
>>
>> So yes, both chipset and GPU sampling is very much possible, and it
>> does not require the tweaking of the syscall target parameters -
>> each CPU has a typically symmetric view on it.
>>
>
> Except there can be many GPUs, I/O devices and other pieces of
> hardware with PMU-like capabilities in a single system. In that case,
> you need to be able to name them: I want to measure GPUcycles on
> GPU0. When you are down at that level, you don't really care about
> the CPU or thread. So what would you pass for those in that case?
>
>> Note that there's overlap: a CPU can be an event source and a
>> scheduling target as well. I think some of the confusion in
>> terminology comes from that.
>>
>> To support chipset or GPU sampling, the perf_type_id and/or the
>> struct perf_counter_attr space can be extended.
>>
One solution might be to split up the cpu argument to sys_perf_counter_open into
two fields, for example:
union {
	int cpu;
	struct {
		/* intended layout: pmu_type in the upper 12 bits,
		   pmu_number in the lower 20 bits */
		__u32	pmu_type : 12,   /* 0 = CPU, 1 = Socket, 2 = GPU,
					    3-4095 = arch-defined ... */
			pmu_number : 20; /* when pmu_type is 0, 0 = CPU0, 1 = CPU1, etc. */
	};
};
If cpu is not equal to -1, then it would be interpreted as the union above. For
the most common case, where you are monitoring CPUs, you just pass the cpu
number in without worrying about the upper bits (as long as the cpu number is
less than 2^20!).
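To make the encoding concrete, here is a rough sketch of the same split done
with explicit shifts and masks, which avoids depending on how a given compiler
orders bitfields. All of the names below (pmu_encode, pmu_decode_type,
PMU_TYPE_GPU, and so on) are made up for illustration; nothing like this exists
in the current perf_counter interface.

```c
#include <stdint.h>

/* Hypothetical layout, following the split described above:
 * upper 12 bits = PMU type, lower 20 bits = PMU instance number. */
#define PMU_NUMBER_BITS 20
#define PMU_TYPE_BITS   12
#define PMU_NUMBER_MASK ((UINT32_C(1) << PMU_NUMBER_BITS) - 1)
#define PMU_TYPE_MASK   ((UINT32_C(1) << PMU_TYPE_BITS) - 1)

/* Illustrative type values taken from the comment in the proposal. */
enum pmu_type {
	PMU_TYPE_CPU    = 0,
	PMU_TYPE_SOCKET = 1,
	PMU_TYPE_GPU    = 2,
};

/* Pack a (type, number) pair into the value passed as the cpu argument.
 * Out-of-range inputs are truncated to their field widths. */
static inline int pmu_encode(uint32_t type, uint32_t number)
{
	uint32_t v = ((type & PMU_TYPE_MASK) << PMU_NUMBER_BITS) |
		     (number & PMU_NUMBER_MASK);
	return (int)v;
}

/* Extract the PMU type from an encoded cpu argument. */
static inline uint32_t pmu_decode_type(int cpu)
{
	return ((uint32_t)cpu >> PMU_NUMBER_BITS) & PMU_TYPE_MASK;
}

/* Extract the PMU instance number from an encoded cpu argument. */
static inline uint32_t pmu_decode_number(int cpu)
{
	return (uint32_t)cpu & PMU_NUMBER_MASK;
}
```

With this layout, pmu_encode(PMU_TYPE_CPU, 3) is just 3, so existing callers
that pass a plain cpu number keep working, while "GPU0" would be
pmu_encode(PMU_TYPE_GPU, 0). One thing to watch: type 4095 combined with
number 0xFFFFF has all 32 bits set, which on the usual two's-complement
targets is the same bit pattern as the -1 sentinel, so that one combination
would have to be reserved.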
As time goes on, I think we will see more chips with fancy integrated GPUs and
other types of accelerators, as well as other off-chip processors/accelerators,
that developers will want to monitor.
- Corey
Corey Ashford
Software Engineer
IBM Linux Technology Center, Linux Toolchain
Beaverton, OR
cjashfor@...ibm.com