Message-ID: <51118797.9080800@linaro.org>
Date: Tue, 05 Feb 2013 14:28:39 -0800
From: John Stultz <john.stultz@...aro.org>
To: Stephane Eranian <eranian@...gle.com>
CC: Pawel Moll <pawel.moll@....com>,
Peter Zijlstra <peterz@...radead.org>,
LKML <linux-kernel@...r.kernel.org>,
"mingo@...e.hu" <mingo@...e.hu>, Paul Mackerras <paulus@...ba.org>,
Anton Blanchard <anton@...ba.org>,
Will Deacon <Will.Deacon@....com>,
"ak@...ux.intel.com" <ak@...ux.intel.com>,
Pekka Enberg <penberg@...il.com>,
Steven Rostedt <rostedt@...dmis.org>,
Robert Richter <robert.richter@....com>,
tglx <tglx@...utronix.de>
Subject: Re: [RFC] perf: need to expose sched_clock to correlate user samples
with kernel samples
On 02/05/2013 02:13 PM, Stephane Eranian wrote:
> On Fri, Feb 1, 2013 at 3:18 PM, Pawel Moll <pawel.moll@....com> wrote:
>> Hello,
>>
>> I'd like to revive the topic...
>>
>> On Tue, 2012-10-16 at 18:23 +0100, Peter Zijlstra wrote:
>>> On Tue, 2012-10-16 at 12:13 +0200, Stephane Eranian wrote:
>>>> Hi,
>>>>
>>>> There are many situations where we want to correlate events happening at
>>>> the user level with samples recorded in the perf_event kernel sampling buffer.
>>>> For instance, we might want to correlate the call to a function or creation of
>>>> a file with samples. Similarly, when we want to monitor a JVM with jitted code,
>>>> we need to be able to correlate jitted code mappings with perf event samples
>>>> for symbolization.
>>>>
>>>> Perf_events allows timestamping of samples with PERF_SAMPLE_TIME.
>>>> That causes each PERF_RECORD_SAMPLE to include a timestamp
>>>> generated by calling the local_clock() -> sched_clock_cpu() function.
>>>>
>>>> To make correlating user vs. kernel samples easy, we would need to
>>>> access that sched_clock() functionality. However, none of the existing
>>>> clock calls permit this at this point. They all return timestamps which are
>>>> not using the same source and/or offset as sched_clock.
>>>>
>>>> I believe a similar issue exists with the ftrace subsystem.
>>>>
>>>> The problem needs to be addressed in a portable manner. Solutions
>>>> based on reading TSC for the user level to reconstruct sched_clock()
>>>> don't seem appropriate to me.
>>>>
>>>> One possibility to address this limitation would be to extend clock_gettime()
>>>> with a new clock type, e.g., CLOCK_PERF.
>>>>
>>>> However, I understand that sched_clock_cpu() provides ordering guarantees only
>>>> when invoked on the same CPU repeatedly, i.e., it's not globally synchronized.
>>>> But we already have to deal with this problem when merging samples obtained
>>>> from different CPU sampling buffers in per-thread mode. So this is not
>>>> necessarily a showstopper.
>>>>
>>>> An alternative could be to use uprobes, but that's less practical to set up.
>>>>
>>>> Anyone with better ideas?
>>> You forgot to CC the time people ;-)
>>>
>>> I've no problem with adding CLOCK_PERF (or another/better name).
>>>
>>> Thomas, John?
>> I've just faced the same issue - correlating an event in userspace with
>> data from the perf stream, but to my mind what I want to get is a value
>> returned by perf_clock() _in the current "session" context_.
>>
>> Stephane didn't like the idea of opening a "fake" perf descriptor in
>> order to get the timestamp, but surely one must have the "session"
>> already running to be interested in such data in the first place? So I
>> think the ioctl() idea is not out of place here... How about the simple
>> change below?
>>
> The app requesting the timestamp may not necessarily have an active
> perf session. And by that I mean, it may not be self-monitoring. But it
> could be monitored by an external tool such as perf, without necessarily
> knowing it.
>
> The timestamp is global or at least per-cpu. It is not tied to a particular
> active event.
>
> The thing I did not like about ioctl() is that it now means that the app
> needs to become a user of the perf_event API. It needs to program
> a dummy event just to get a timestamp. As opposed to just calling
> a clock_gettime(CLOCK_PERF) function which guarantees a clock
> source identical to that used by perf_events. In that case, the app
> timestamps its events in such a way that if it was monitored externally,
> that external tool would be able to correlate all the samples because they
> would all have the same time source.
>
> But if people are strongly opposed to the clock_gettime() approach, then
> I can go with the ioctl() because the functionality is definitely needed
> ASAP.
I prefer the ioctl method, since it's less likely to be re-purposed/misused.
Though I'd be most comfortable with finding some way for perf-timestamps
to be CLOCK_MONOTONIC based (or maybe CLOCK_MONOTONIC_RAW if it would be
easier), and just avoid altogether adding another time domain that
doesn't really have a clear definition (other than "what perf uses").
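
As a rough sketch of what a CLOCK_MONOTONIC based scheme would mean for
userspace: the app would stamp its own events with plain clock_gettime()
and need no perf_event descriptor. This only helps if perf timestamps were
in the CLOCK_MONOTONIC domain, which is an assumption here and not what
perf does as described in this thread (samples are stamped via
local_clock()). The helper names mono_now_ns() and format_user_event() and
the "<ns> <name>" record layout are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper: read CLOCK_MONOTONIC and return it as nanoseconds.
 * Returns 0 if clock_gettime() fails.
 */
static uint64_t mono_now_ns(void)
{
	struct timespec ts;

	if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
		return 0;

	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical helper: format one userspace event record as
 * "<timestamp_ns> <name>" into buf. Returns what snprintf() returns.
 * An external tool could later sort these records against perf samples,
 * but only if both sides use the same time domain (assumed, see above).
 */
static int format_user_event(char *buf, size_t len, const char *name,
			     uint64_t ts_ns)
{
	assert(buf != NULL && name != NULL);

	return snprintf(buf, len, "%" PRIu64 " %s", ts_ns, name);
}
```

An app would call format_user_event(buf, sizeof(buf), "jit_map",
mono_now_ns()) at the point of interest and write the record to its own
log. Swapping in CLOCK_MONOTONIC_RAW is a one-line change in
mono_now_ns().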
thanks
-john