[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20090810125731.GA5124@nowhere>
Date: Mon, 10 Aug 2009 14:57:32 +0200
From: Frederic Weisbecker <fweisbec@...il.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...e.hu>, LKML <linux-kernel@...r.kernel.org>,
Arnaldo Carvalho de Melo <acme@...hat.com>,
Mike Galbraith <efault@....de>,
Paul Mackerras <paulus@...ba.org>
Subject: Re: [PATCH 4/3] perf_counter: Correct PERF_SAMPLE_RAW output
On Mon, Aug 10, 2009 at 11:27:27AM +0200, Peter Zijlstra wrote:
> Subject: perf_counter: Correct PERF_SAMPLE_RAW output
> From: Peter Zijlstra <a.p.zijlstra@...llo.nl>
> Date: Mon Aug 10 11:16:52 CEST 2009
>
> PERF_SAMPLE_* output switches should unconditionally output the
> correct format, as they are the only way to unambiguously parse the
> PERF_EVENT_SAMPLE data.
>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@...llo.nl>
> ---
> include/linux/perf_counter.h | 2 ++
> include/trace/ftrace.h | 3 ++-
> kernel/perf_counter.c | 30 ++++++++++++++++++++++++------
> 3 files changed, 28 insertions(+), 7 deletions(-)
>
> Index: linux-2.6/include/linux/perf_counter.h
> ===================================================================
> --- linux-2.6.orig/include/linux/perf_counter.h
> +++ linux-2.6/include/linux/perf_counter.h
> @@ -369,6 +369,8 @@ enum perf_event_type {
> *
> * { u64 nr,
> * u64 ips[nr]; } && PERF_SAMPLE_CALLCHAIN
> + * { u32 size;
> + * char data[size];}&& PERF_SAMPLE_RAW
> * };
> */
> PERF_EVENT_SAMPLE = 9,
> Index: linux-2.6/include/trace/ftrace.h
> ===================================================================
> --- linux-2.6.orig/include/trace/ftrace.h
> +++ linux-2.6/include/trace/ftrace.h
> @@ -687,7 +687,8 @@ static void ftrace_profile_##call(proto)
> pc = preempt_count(); \
> \
> __data_size = ftrace_get_offsets_##call(&__data_offsets, args); \
> - __entry_size = ALIGN(__data_size + sizeof(*entry), sizeof(u64));\
> + __entry_size = ALIGN(__data_size + sizeof(*entry) + sizeof(u32),\
> + sizeof(u64)); \
Here you are reserving a room for the size of the buffer inside the buffer.
> \
> do { \
> char raw_data[__entry_size]; \
> Index: linux-2.6/kernel/perf_counter.c
> ===================================================================
> --- linux-2.6.orig/kernel/perf_counter.c
> +++ linux-2.6/kernel/perf_counter.c
> @@ -2647,7 +2647,6 @@ void perf_counter_output(struct perf_cou
> u64 counter;
> } group_entry;
> struct perf_callchain_entry *callchain = NULL;
> - struct perf_raw_record *raw = NULL;
> int callchain_size = 0;
> u64 time;
> struct {
> @@ -2717,9 +2716,15 @@ void perf_counter_output(struct perf_cou
> }
>
> if (sample_type & PERF_SAMPLE_RAW) {
> - raw = data->raw;
> - if (raw)
> - header.size += raw->size;
> + int size = sizeof(u32);
> +
> + if (data->raw)
> + size += data->raw->size;
And here you reserve a size for the buffer once again?
> + else
> + size += sizeof(u32);
> +
> + WARN_ON_ONCE(size & (sizeof(u64)-1));
> + header.size += size;
> }
>
> ret = perf_output_begin(&handle, counter, header.size, nmi, 1);
> @@ -2785,8 +2790,21 @@ void perf_counter_output(struct perf_cou
> }
> }
>
> - if ((sample_type & PERF_SAMPLE_RAW) && raw)
> - perf_output_copy(&handle, raw->data, raw->size);
> + if (sample_type & PERF_SAMPLE_RAW) {
> + if (data->raw) {
> + perf_output_put(&handle, data->raw->size);
> + perf_output_copy(&handle, data->raw->data, data->raw->size);
And actually you copy the buffer that has the size of the buffer
plus the u32 reserved for the size, whereas the size has already been
copied.
When I look at a perf sample raw it gives me the following:
. 0000: 09 00 00 00 01 00 54 00 d0 c7 00 81 ff ff ff ff ......T........
. 0010: 69 01 00 00 69 01 00 00 e6 15 00 00 00 00 00 00 i...i..........
. 0020: 30 00 00 00 2b 00 01 02 69 01 00 00 69 01 00 00 0...+...i...i..
^ ^ ^ ^ ^ ^ ^ ^ ^ ^ .................
Buf size struct trace_entry
. 0030: 6b 6f 6e 64 65 6d 61 6e 64 2f 31 00 00 00 00 00 kondemand/1....
^ ^ ^ ^ ..................................
Rest of struct ftrace_raw_<call>
. 0040: 69 01 00 00 70 82 46 81 ff ff ff ff 00 00 00 00 i...p.F........
.................................. ^ ^ ^ ^
The room you have
reserved for the buffer size
(zeroed from a tmp patch)
. 0050: 00 00 00 00
^ ^ ^ ^
padding from alignment (ALIGN(__data_size + sizeof(*entry) + sizeof(u32),\
sizeof(u64));
I first thought the 0x4c bytes was a padding from gcc because
sizeof(struct ftrace_raw_<call>) % 8 != 0
But no, it is equal to 0: It's between 0x4c and 0x24 = 40 bytes.
I'm preparing a patch to fix this. Just wanted to expose my idea here,
because I'm perhaps wrong in the middle of this dump.
I'll do the alignment right in the end, just before inserting the entry
in the output. That will be more easy. I'll zeroe the rest in the same time.
> + } else {
> + struct {
> + u32 size;
> + u32 data;
> + } raw = {
> + .size = sizeof(u32),
> + .data = 0,
> + };
> + perf_output_put(&handle, raw);
> + }
> + }
>
> perf_output_end(&handle);
> }
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists