Message-ID: <5501E54C.2090107@gmail.com>
Date: Thu, 12 Mar 2015 13:13:16 -0600
From: David Ahern <dsahern@...il.com>
To: Stephane Eranian <eranian@...gle.com>,
Adrian Hunter <adrian.hunter@...el.com>
CC: LKML <linux-kernel@...r.kernel.org>,
Arnaldo Carvalho de Melo <acme@...hat.com>,
Namhyung Kim <namhyung@...nel.org>,
Jiri Olsa <jolsa@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
"mingo@...e.hu" <mingo@...e.hu>
Subject: Re: [BUG] perf report: ordered events and flushing bug
On 3/12/15 1:05 PM, Stephane Eranian wrote:
> On Thu, Mar 12, 2015 at 5:02 AM, Adrian Hunter <adrian.hunter@...el.com> wrote:
>> On 12/03/15 05:32, Stephane Eranian wrote:
>>> Hi,
>>>
>>> I am working on the JIT support to improve the flow and have
>>> perf record inject the MMAPs at the end of the collection. For
>>> that I piggyback on the buildid pass. To avoid rewriting the entire perf.data
>>> file, I simply append the MMAP records to the end of the file. And that
>>> puts them out-of-order in time systematically. But I thought it would be okay
>>> because perf report would sort them by timestamps again.
>>>
>>> Well, it does not in all cases! Why?
>>>
>>> Because of the round flushing: depending on how far out-of-order an event is,
>>> it may not be processed correctly. I believe this
>>> may only impact MMAP records. This is a serious issue because
>>> mmaps drive symbolization of samples. If samples are processed without the
>>> proper dso mapping, then samples may not be symbolized or may be wrongly
>>> symbolized.
>>>
>>> So far, the workaround I found was to set oe->next_flush = 0 for the ROUND
>>> mode. In other words, do not flush anything until FINAL. To me, this is the
>>> only sensible way of avoiding this kind of problem. I am not sure I
>>> understand the point of flushing anyway, except maybe to minimize the memory
>>> footprint. But it does not work with vastly out-of-order mmaps.
Rounds, and flushing after them, help with the user experience -- at
least for some commands. On systems with 1024 CPUs, perf data files get
large quickly, and the resulting analysis command can appear to hang for
long periods (e.g., I have done 1-second data collections that generate
5GB files and take 30-45 minutes before perf shows any output).
I have a patch to perf-record that lets a user specify the poll timeout.
That helps define the maximum time between 'rounds' on the record side;
commands that show each event (e.g., perf script) then produce output much
more quickly, without needing to consume large amounts of memory or process
the entire file before displaying events.
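
To make the ordering problem concrete, here is a minimal, self-contained
sketch (hypothetical names, not the actual ordered-events code) of how
flushing up to a per-round timestamp limit delivers samples before a
late-appended MMAP that carries an earlier timestamp:

#include <stdio.h>

/* Simplified stand-in for a queued event (not perf's real type). */
struct event {
        unsigned long long time;
        const char *kind;
};

/* Flush limit, updated each time a FINISHED_ROUND record is seen. */
static unsigned long long next_flush;

/* Deliver every queued event whose timestamp is below the flush limit. */
static void flush(const struct event *queue, int n)
{
        int i;

        for (i = 0; i < n; i++) {
                if (queue[i].time < next_flush)
                        printf("deliver %-6s @ %llu\n",
                               queue[i].kind, queue[i].time);
        }
}

int main(void)
{
        /* Round 1: ordinary samples recorded at t=100 and t=200. */
        const struct event round1[] = { { 100, "SAMPLE" }, { 200, "SAMPLE" } };
        /* JIT MMAP appended at the end of perf.data, but stamped t=150. */
        const struct event late[] = { { 150, "MMAP" } };

        next_flush = 250;       /* limit taken from the FINISHED_ROUND */
        flush(round1, 2);       /* samples delivered and symbolized now */

        /*
         * By the time the appended MMAP is queued, the samples it should
         * map were already delivered without it.  Keeping next_flush at 0
         * for ROUND and flushing only at FINAL (Stephane's workaround)
         * lets the final sort put it back in front of those samples.
         */
        flush(late, 1);

        return 0;
}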
>>>
>>> Do you have a better solution?
>>
>> You could hook the ordered event delivery (see Arnaldo's perf/core branch):
>>
>>         if (tool->ordered_events) {
>>                 inject->deliver = session->ordered_events.deliver;
>>                 session->ordered_events.deliver = jit_mmap_deliver;
>>         }
>>
>> int jit_mmap_deliver(struct ordered_events *oe, struct ordered_event *event, struct perf_sample *sample)
>> {
>>         struct perf_inject *inject = container_of(oe->tool, struct perf_inject, tool);
>>
>>         if (next_jit_mmap_time < sample->time) {
>>                 <synth mmap event>
>>                 perf_session__deliver_synth_event(...);
>>         }
>>         return inject->deliver(oe, event, sample);
>> }
>>
> I don't quite understand what this does.
> Are you saying that in perf/core there is a new callback mechanism to
> override the deliver phase when you need ordered events, and that this
> callback could be used to avoid the flushing?
>
I think the suggestion is to push the mmap events into the stream at the
appropriate time slot.
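
If it helps, a rough sketch of what that hook could look like is below. The
function shape and the container_of() line follow Adrian's snippet above;
the jit_mmaps array, its fields, and the deliver_jit_mmap() helper are
placeholders I am assuming here, not existing perf-inject code:

/* Hypothetical bookkeeping for the appended JIT mmaps. */
struct jit_mmap {
        u64 time;                       /* when the JIT code was mapped */
        union perf_event *event;        /* pre-built MMAP record for it */
};

/*
 * Stand-in for the perf_session__deliver_synth_event() call in Adrian's
 * snippet; the real arguments are elided here.
 */
static void deliver_jit_mmap(struct perf_inject *inject, union perf_event *event);

static int jit_mmap_deliver(struct ordered_events *oe, struct ordered_event *event,
                            struct perf_sample *sample)
{
        struct perf_inject *inject = container_of(oe->tool, struct perf_inject, tool);

        /*
         * Emit every pending JIT mmap whose timestamp precedes the event
         * being delivered, so the mapping exists before any sample that
         * needs it for symbolization.
         */
        while (inject->next_jit < inject->nr_jit &&
               inject->jit_mmaps[inject->next_jit].time <= sample->time) {
                struct jit_mmap *jm = &inject->jit_mmaps[inject->next_jit++];

                deliver_jit_mmap(inject, jm->event);
        }

        /* Then hand the event itself to the original delivery path. */
        return inject->deliver(oe, event, sample);
}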
David