Message-ID: <1359728280.8360.15.camel@hornet>
Date: Fri, 01 Feb 2013 14:18:00 +0000
From: Pawel Moll <pawel.moll@....com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Stephane Eranian <eranian@...gle.com>,
LKML <linux-kernel@...r.kernel.org>,
"mingo@...e.hu" <mingo@...e.hu>, Paul Mackerras <paulus@...ba.org>,
Anton Blanchard <anton@...ba.org>,
Will Deacon <Will.Deacon@....com>,
"ak@...ux.intel.com" <ak@...ux.intel.com>,
Pekka Enberg <penberg@...il.com>,
Steven Rostedt <rostedt@...dmis.org>,
Robert Richter <robert.richter@....com>,
tglx <tglx@...utronix.de>, John Stultz <john.stultz@...aro.org>
Subject: Re: [RFC] perf: need to expose sched_clock to correlate user
samples with kernel samples

Hello,

I'd like to revive the topic...

On Tue, 2012-10-16 at 18:23 +0100, Peter Zijlstra wrote:
> On Tue, 2012-10-16 at 12:13 +0200, Stephane Eranian wrote:
> > Hi,
> >
> > There are many situations where we want to correlate events happening at
> > the user level with samples recorded in the perf_event kernel sampling buffer.
> > For instance, we might want to correlate the call to a function or creation of
> > a file with samples. Similarly, when we want to monitor a JVM with jitted code,
> > we need to be able to correlate jitted code mappings with perf event samples
> > for symbolization.
> >
> > Perf_events allows timestamping of samples with PERF_SAMPLE_TIME.
> > That causes each PERF_RECORD_SAMPLE to include a timestamp
> > generated by calling the local_clock() -> sched_clock_cpu() function.
> >
> > To make correlating user vs. kernel samples easy, we would need to
> > access that sched_clock() functionality. However, none of the existing
> > clock calls permit this at this point. They all return timestamps which are
> > not using the same source and/or offset as sched_clock.
> >
> > I believe a similar issue exists with the ftrace subsystem.
> >
> > The problem needs to be addressed in a portable manner. Solutions
> > based on reading TSC for the user level to reconstruct sched_clock()
> > don't seem appropriate to me.
> >
> > One possibility to address this limitation would be to extend clock_gettime()
> > with a new clock type, e.g., CLOCK_PERF.
> >
> > However, I understand that sched_clock_cpu() provides ordering guarantees only
> > when invoked on the same CPU repeatedly, i.e., it's not globally synchronized.
> > But we already have to deal with this problem when merging samples obtained
> > from different CPU sampling buffers in per-thread mode. So this is not
> > necessarily a showstopper.
> >
> > Alternatives could be to use uprobes but that's less practical to set up.
> >
> > Anyone with better ideas?
>
> You forgot to CC the time people ;-)
>
> I've no problem with adding CLOCK_PERF (or another/better name).
>
> Thomas, John?

I've just faced the same issue - correlating an event in userspace with
data from the perf stream, but to my mind what I want to get is a value
returned by perf_clock() _in the current "session" context_.

Stephane didn't like the idea of opening a "fake" perf descriptor in
order to get the timestamp, but surely one must have the "session"
already running to be interested in such data in the first place? So I
think the ioctl() idea is not out of place here... How about the simple
change below?

Regards

Pawel

8<---
From 2ad51a27fbf64bf98cee190efc3fbd7002819692 Mon Sep 17 00:00:00 2001
From: Pawel Moll <pawel.moll@....com>
Date: Fri, 1 Feb 2013 14:03:56 +0000
Subject: [PATCH] perf: Add ioctl to return current time value

To correlate user space events with the perf events stream,
the current time value (as in: "what time(stamp) is it now?")
must be made available.

This patch adds a perf ioctl that makes this possible.

Signed-off-by: Pawel Moll <pawel.moll@....com>
---
 include/uapi/linux/perf_event.h | 1 +
 kernel/events/core.c            | 8 ++++++++
 2 files changed, 9 insertions(+)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 4f63c05..b745fb0 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -316,6 +316,7 @@ struct perf_event_attr {
 #define PERF_EVENT_IOC_PERIOD		_IOW('$', 4, __u64)
 #define PERF_EVENT_IOC_SET_OUTPUT	_IO ('$', 5)
 #define PERF_EVENT_IOC_SET_FILTER	_IOW('$', 6, char *)
+#define PERF_EVENT_IOC_GET_TIME		_IOR('$', 7, __u64)
 
 enum perf_event_ioc_flags {
 	PERF_IOC_FLAG_GROUP		= 1U << 0,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 301079d..4202b1c 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3298,6 +3298,14 @@ static long perf_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 	case PERF_EVENT_IOC_SET_FILTER:
 		return perf_event_set_filter(event, (void __user *)arg);
 
+	case PERF_EVENT_IOC_GET_TIME:
+	{
+		u64 time = perf_clock();
+		if (copy_to_user((void __user *)arg, &time, sizeof(time)))
+			return -EFAULT;
+		return 0;
+	}
+
 	default:
 		return -ENOTTY;
 	}
--
1.7.10.4
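
For illustration only, here is a minimal userspace sketch of how the ioctl
proposed in the patch above might be called. It is not part of the patch.
The helper names open_session_fd() and perf_get_time() are hypothetical, and
PERF_EVENT_IOC_GET_TIME is defined locally because it exists only in this
RFC. The sketch assumes a kernel carrying this patch: mainline later
assigned ioctl number 7 to PERF_EVENT_IOC_ID, so on an unpatched kernel the
call would either fail or return something other than a timestamp.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * ASSUMPTION: this ioctl exists only in the RFC patch above, not in
 * mainline headers. Mainline later used number 7 for PERF_EVENT_IOC_ID.
 */
#ifndef PERF_EVENT_IOC_GET_TIME
#define PERF_EVENT_IOC_GET_TIME	_IOR('$', 7, __u64)
#endif

/*
 * Hypothetical helper: open a disabled software event on the calling
 * thread, purely to have a perf descriptor (the "session") to issue the
 * ioctl on. Returns the fd, or -1 with errno set.
 */
int open_session_fd(void)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_SOFTWARE;
	attr.config = PERF_COUNT_SW_TASK_CLOCK;
	attr.sample_type = PERF_SAMPLE_TIME;
	attr.disabled = 1;

	/* glibc provides no wrapper for perf_event_open(), hence syscall(). */
	return (int)syscall(SYS_perf_event_open, &attr,
			    0 /* calling thread */, -1 /* any CPU */,
			    -1 /* no group leader */, 0UL /* no flags */);
}

/*
 * Hypothetical helper: ask the kernel for the current perf timestamp.
 * Returns 0 and stores the value in *now, or -1 with errno set.
 */
int perf_get_time(int fd, uint64_t *now)
{
	__u64 t = 0;

	if (ioctl(fd, PERF_EVENT_IOC_GET_TIME, &t) < 0)
		return -1;

	*now = t;
	return 0;
}
```

A tool would call perf_get_time() on the descriptor of a session it already
has open, at the moment the userspace event of interest happens, and later
compare the returned value against the PERF_SAMPLE_TIME field of the
samples read from the ring buffer.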