[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CABPqkBRLF6-EhXik5aRnWnCXXycN9oVM4_oFND77ntavjsfpcA@mail.gmail.com>
Date: Thu, 26 Jan 2017 14:51:02 -0800
From: Stephane Eranian <eranian@...gle.com>
To: Andres Freund <andres@...razel.de>
Cc: Peter Zijlstra <peterz@...radead.org>,
"eranian@...il.com" <eranian@...il.com>,
LKML <linux-kernel@...r.kernel.org>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Jiri Olsa <jolsa@...hat.com>, Ingo Molnar <mingo@...e.hu>,
Anton Blanchard <anton@...abs.org>,
Namhyung Kim <namhyung@...nel.org>
Subject: Re: perf/jit doesn't cope well with mprotect() to jit containing pages
On Thu, Jan 26, 2017 at 2:38 PM, Andres Freund <andres@...razel.de> wrote:
> On 2017-01-26 14:26:12 -0800, Stephane Eranian wrote:
>> On Thu, Jan 26, 2017 at 2:19 PM, Peter Zijlstra <peterz@...radead.org> wrote:
>> > On Thu, Jan 26, 2017 at 01:00:52PM -0800, Andres Freund wrote:
>> >> The problem is that (quoted below) without that hack the subsequent
>> >> mmaps just expand the previous VMAs which leads to perf loosing its
>> >> (address,range) -> symbol mappings for previously (in the same expanded
>> >> range) known symbols.
>> >
>> > I'm assuming this is the userspace tool doing that, right?
>> >
>> > Acme, is that something we can cure? Instead of throwing away eveything
>> > known of the previous map, merge them when they grow?
>> >
>> > So I think the problem was that we don't track munmap(), so if you see
>> > two overlapping mappings, we have no clue and assume an munmap() has
>> > happened or something. I can't really remember, its been many many years
>> > since I wrote all that :-(
>>
>> But perf report takes care of overlapping mmaps normally,
>> see util/map.c:map_groups__fixup_overlappings
>>
>> VMA merging does not change anything at the user level.
>> The kernel will see two mmaps and so will perf record and perf report.
>
> But the kernel *didn't* report two independent mmaps.
>
> Looking at strace shows that the mmaps were done independently in page
> sized chunks:
>
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7efd3866e000
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7efd3866d000
> mprotect(0x7efd3866e000, 4096, PROT_READ|PROT_EXEC) = 0
> mprotect(0x7efd3866d000, 4096, PROT_READ|PROT_EXEC) = 0
> write(2, "Function loaded: evalexpr0 at 139626038091776 0x7efd3866e000 len 69", 68) = 68
>
> (first function)
>
> mmap(0x7efd3866f000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7efd3866c000
> mmap(0x7efd3866e000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7efd3866b000
> mprotect(0x7efd3866c000, 4096, PROT_READ|PROT_EXEC) = 0
> mprotect(0x7efd3866b000, 4096, PROT_READ|PROT_EXEC) = 0
> write(2, "Function loaded: evalexpr1 at 139626038083584 0x7efd3866c000 len 69", 68) = 68
>
> (second function)
>
>
> But then looking at the unannotated perf script --show-mmap-events:
>
> postgres 4107 595444.867737: PERF_RECORD_MMAP2 4107/4107: [0x7efd3866e000(0x1000) @ 0x7efd3866e000 00:00 0 0]: ---p //anon
> postgres 4107 595444.867825: PERF_RECORD_MMAP2 4107/4107: [0x7efd3866d000(0x2000) @ 0x7efd3866d000 00:00 0 0]: ---p //anon
> postgres 4107 595444.884090: PERF_RECORD_MMAP2 4107/4107: [0x7efd3866c000(0x3000) @ 0x7efd3866c000 00:00 0 0]: ---p //anon
> postgres 4107 595444.884113: PERF_RECORD_MMAP2 4107/4107: [0x7efd3866b000(0x4000) @ 0x7efd3866b000 00:00 0 0]: ---p //anon
>
> these four mmaps regions are:
> 0x7efd3866e000(0x1000): 0x7efd3866e000 - 0x7efd3866f000
> 0x7efd3866d000(0x2000): 0x7efd3866d000 - 0x7efd3866f000
> 0x7efd3866c000(0x3000): 0x7efd3866c000 - 0x7efd3866f000
> 0x7efd3866b000(0x4000): 0x7efd3866b000 - 0x7efd3866f000
>
> So there's just a single mapping being reported.
>
>
>> This is what you have.
>
> I'm a bit confused - isn't the trace showing exactly that the mmaps
> *are* reported with a merged VMA?
>
>
>> In the case of Java, if I recall, there is a single large mmap for the
>> code cache and it works fine.
>
> That explains why it's working there...
>
Ok, I think I see the problem now (sorry was slow....):
the timeline is as follows as seen from the user in your case:
T0: mmap(0x1000) for func1()
T1 mmap(0x2000) for func1();
T3: jit emits info func1() [0x1000-0x1fff]
T4: mmap(0x3000) for func2()
T5: mmap(0x4000) for funcs2()
T6: jit emits info for func2() [0x2000-0x3fff]
But the problem is that each mmap covers existing mmaps and thus
supersedes the others as per the time stamp.
The problem is not specific to jit, it just reveals itself in your case.
The logic in perf is that a more recent mmap supersedes an older one,
so you have:
T3: 0x1000-0x2000 owned by func1
T4: 0x1000-0x3000 owned by anon
T5: 0x1000-0x4000 owned by anon
T6: 0x1000-0x4000 owned partially by func2()
And thus perf cannot symbolize func1() anymore because it has nothing
mapped in 0x1000-0x1fff but anon.
Did I get the problem right this time?
This is tricky to solve here because the tool does not know about the
merging of the VMAs and assume you are overlapping mmaps and not
merging them.
Again the problem is not specific to jit, merging of VMA can happen
anytime with any app.
> Greetings,
>
> Andres Freund
Powered by blists - more mailing lists