linux-kernel - Re: [BUG] perf report: ordered events and flushing bug

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CABPqkBQdvMe=vft4tuN14FrWApBHxa=wjOxjrTnkc4Uga6HGpQ@mail.gmail.com>
Date:	Thu, 12 Mar 2015 15:23:00 -0400
From:	Stephane Eranian <eranian@...gle.com>
To:	David Ahern <dsahern@...il.com>
Cc:	Adrian Hunter <adrian.hunter@...el.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	Namhyung Kim <namhyung@...nel.org>,
	Jiri Olsa <jolsa@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	"mingo@...e.hu" <mingo@...e.hu>
Subject: Re: [BUG] perf report: ordered events and flushing bug

On Thu, Mar 12, 2015 at 3:13 PM, David Ahern <dsahern@...il.com> wrote:
> On 3/12/15 1:05 PM, Stephane Eranian wrote:
>>
>> On Thu, Mar 12, 2015 at 5:02 AM, Adrian Hunter <adrian.hunter@...el.com>
>> wrote:
>>>
>>> On 12/03/15 05:32, Stephane Eranian wrote:
>>>>
>>>> Hi,
>>>>
>>>> I am working on the JIT support to improve the flow and have
>>>> perf record inject the MMAPs at the end of the collection. For
>>>> that I piggyback on the buildid pass. To avoid rewriting the entire
>>>> perf.data
>>>> file, I simply append the the MMAP records at the end of the file. And
>>>> that
>>>> puts them out-of-order in time systematically. But I thought it would be
>>>> okay
>>>> because perf report would sort them by timestamps again.
>>>>
>>>> Well, it does not in all cases! Why?
>>>>
>>>> Because of the round flushing. Based on how far out-of-order an event
>>>> is,
>>>> it may not be processed correctly because of round flushing. I believe
>>>> this
>>>> may only impact MMAP records.  This is a serious issue because
>>>> mmaps drive symbolization of samples. If samples are processed without
>>>> the
>>>> proper dso mapping, then samples may not be symbolized or may be wrongly
>>>> symbolized.
>>>>
>>>> So far, the workaround I found was to set the oe->next_flush = 0 for
>>>> the ROUND mode.
>>>> In other words, do not flush anything until FINAL. To me, this is the
>>>> only sensible
>>>> way of avoiding this kind of problems. I am not sure I understand the
>>>> point of flushing
>>>> anyway, except to minimize memory footprint, maybe. But it does not
>>>> work with vastly
>>>> out-of-order mmaps.
>
>
> Rounds and flushing after them helps with the user experience -- at least
> for some commands. On systems with 1024 cpus perf data files get large
> quickly and the resulting analysis command can appear to hang for long
> periods (e.g., i have done 1 second data collections that generate 5GB files
> and take 30-45 minutes before perf shows any output).
>
> I have a patch to perf-record to allow a user to specify the poll timeout.
> That helps define the maximum time between 'rounds' on the record side and
> then commands that show each event (e.g, perf script) will show output much
> more quickly without the need to consume large amounts of memory or process
> the entire file before displaying events.
>
I can understand this but not at the cost of correctness. You have no guarantee
that events will appear in the perf.data with a fixed maximum time window, i.e.,
beyond the window, out-of-order events may be discarded.
There ought to be a better way of keeping the user aware of what's going on
with the progress bar.


>>>>
>>>> Do you have a better solution?
>>>
>>>
>>> You could hook the ordered event delivery (see Arnaldo's perf/core
>>> branch):
>>>
>>> if (tool->ordered_events) {
>>>          inject->deliver = session->ordered_events.deliver;
>>>          session->ordered_events.deliver = jit_mmap_deliver;
>>> }
>>>
>>> int jit_mmap_deliver(struct ordered_events *oe, struct ordered_event
>>> *event, struct perf_sample *sample)
>>> {
>>>          struct perf_inject *inject = container_of(oe->tool, struct
>>> perf_inject, tool);
>>>
>>>          if (next_jit_mmap_time < sample->time) {
>>>                  <synth mmap event>
>>>                  perf_session__deliver_synth_event(...);
>>>          }
>>>          return inject->deliver(oe, event, sample);
>>> }
>>>
>> I don't quite understand what this does?
>> Are you saying that in perf/core there is a new callback mechanism to
>> override the deliver phase
>> when you need ordered event and that callback could be used to avoid
>> the flushing?
>>
>
> I think the suggestion is to push the mmap events into the stream at the
> appropriate time slot.
>
> David
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/