[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <abe1cbab-d010-0899-acb1-32b1f9bf17df@linux.intel.com>
Date: Mon, 5 Nov 2018 10:43:41 -0500
From: "Liang, Kan" <kan.liang@...ux.intel.com>
To: Stephane Eranian <eranian@...gle.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
LKML <linux-kernel@...r.kernel.org>,
Borislav Petkov <bp@...en8.de>, Andi Kleen <ak@...ux.intel.com>
Subject: Re: [PATCH 1/2] perf: Add munmap callback
On 11/5/2018 5:59 AM, Stephane Eranian wrote:
> Hi Kan,
>
> I built a small test case for you to demonstrate the issue for code and data.
> Compile the test program and then do:
> For text:
> $ perf record ./mmap
> $ perf report -D | fgrep MMAP2
>
> The test program mmaps 2 pages, unmaps the second, and remap 1 page
> over the freed space.
> If you look at the MMAP2 record, you will not be able to reconstruct
> what happened and perf will
> get confused should it try to symbolize from the address range
>
> With Text:
> PERF_RECORD_MMAP2 5937/5937: [0x400000(0x1000) @ 0 08:01 400938
> 824817672]: r-xp /home/eranian/mmap
> PERF_RECORD_MMAP2 5937/5937: [0x7f7c01019000(0x2000) @ 0x7f7c01019000
> 00:00 0 0]: rwxp //anon
> PERF_RECORD_MMAP2 5937/5937: [0x7f7c01019000(0x2000) @ 0x7f7c01019000
> 00:00 0 0]: rwxp //anon
>
> ^^^^^^^^^^^^^^^^^^^^^^^^ captures the whole VMA but not the mapping
> change in user space
>
> For data:
> $ perf record -d ./mmap
> $ perf report -D | fgrep MMAP2
> With data:
> PERF_RECORD_MMAP2 6430/6430: [0x400000(0x1000) @ 0 08:01 400938
> 3278843184]: r-xp /home/eranian/mmap
> PERF_RECORD_MMAP2 6430/6430: [0x7f4aa704b000(0x2000) @ 0x7f4aa704b000
> 00:00 0 0]: rw-p //anon
> PERF_RECORD_MMAP2 6430/6430: [0x7f4aa704b000(0x2000) @ 0x7f4aa704b000
> 00:00 0 0]: rw-p //anon
>
> Same test case with data.
> Perf will think the entire 2 pages have been replaced when in fact
> only the second has.
> I believe the problem is likely to impact data and jitted code cache
>
> #include <sys/types.h>
> #include <stdio.h>
> #include <unistd.h>
> #include <sys/mman.h>
> #include <err.h>
> #include <getopt.h>
>
> int main(int argc, char **argv)
> {
> void *addr1, *addr2;
> size_t pgsz = sysconf(_SC_PAGESIZE);
> int n = 2;
> int ret;
> int c, mode = 0;
>
> while ((c = getopt(argc, argv, "hd")) != -1) {
> switch (c) {
> case 'h':
> printf("[-h]\tget this help\n");
> printf("[-d]\tuse data mmaps (no PROT_EXEC)\n");
> return 0;
> case 'd':
> mode = PROT_EXEC;
> break;
> default:
> errx(1, "unknown option");
> }
> }
> /* default to data */
> if (mode == 0)
> mode = PROT_WRITE;
>
> /*
> * mmap 2 contiugous pages
> */
> addr1 = mmap(NULL, n * pgsz, PROT_READ| mode, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
> if (addr1 == (void *)MAP_FAILED)
> err(1, "mmap 1 failed");
>
> printf("addr1=[%p : %p]\n", addr1, addr1 + n * pgsz);
>
> /*
> * unmap only the second page
> */
> ret = munmap(addr1 + pgsz, pgsz);
> if (ret == -1)
> err(1, "munmp failed");
>
> /*
> * mmap 1 page at the location of the unmap page (should reuse virtual space)
> * This creates a continuous region built from two mmaps and
> potentially two different sources
> * especially with jitted runtimes
> */
The two mmaps are both anon. As my understanding, we cannot symbolize
from the anonymous address, can we?
If we cannot, why we have to distinguish with them? I think we do not
need to know their sources for symbolization.
As my understanding, only --jit can inject MMAP event, which tag an
anon. Perf can symbolize the address after that. Then the unmap is needed.
Thanks,
Kan
> addr2 = mmap(addr1 + pgsz, 1 * pgsz, PROT_READ|PROT_WRITE | mode,
> MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
>
> printf("addr2=%p\n", addr2);
>
> if (addr2 == (void *)MAP_FAILED)
> err(1, "mmap 2 failed");
> if (addr2 != (addr1 + pgsz))
> errx(1, "wrong mmap2 address");
>
> sleep(1);
>
> return 0;
> }
>
> On Thu, Nov 1, 2018 at 7:10 AM Liang, Kan <kan.liang@...ux.intel.com> wrote:
>>
>>
>>
>> On 10/24/2018 3:30 PM, Stephane Eranian wrote:
>>> The need for this new record type extends beyond physical address conversions
>>> and PEBS. A long while ago, someone reported issues with symbolization related
>>> to perf lacking munmap tracking. It had to do with vma merging. I think the
>>> sequence of mmaps was as follows in the problematic case:
>>> 1. addr1 = mmap(8192);
>>> 2. munmap(addr1 + 4096, 4096)
>>> 3. addr2 = mmap(addr1+4096, 4096)
>>>
>>> If successful, that yields addr2 = addr1 + 4096 (could also get the
>>> same without forcing the address).
>>>
>>> In that case, if I recall correctly, the vma for 1st mapping (now at
>>> 4k) and that of the 2nd mapping (4k)
>>> get merged into a single 8k vma and this is what perf_events will
>>> record for PERF_RECORD_MMAP.
>>> On the perf tool side, it is assumed that if two timestamped mappings
>>> overlap then, the latter overrides
>>> the former. In this case, perf would loose the mapping of the first
>>> 4kb and assume all symbols comes from
>>> 2nd mapping. Hopefully I got the scenario right. If so, then you'd
>>> need PERF_RECORD_UNMAP to
>>> disambiguate assuming the perf tool is modified accordingly.
>>>
>>
>> Hi Stephane and Peter,
>>
>> I went through the link(https://lkml.org/lkml/2017/1/27/452). I'm trying
>> to understand the problematic case.
>>
>> It looks like the issue can only be triggered by perf inject --jit.
>> Because it can inject extra MMAP events.
>> As my understanding, Linux kernel only try to merge VMAs if they are
>> both from anon or they are both from the same file. --jit breaks the
>> rule, and makes the merged VMA partly from anon, partly from file.
>> Now, there is a new MMAP event which range covers the modified VMA.
>> Without the help of MUNMAP event, perf tool have no idea if the new one
>> is a newly merged VMA (modified VMA + a new VMA) or a brand new VMA.
>> Current code just simply overwrite the modified VMAs. The VMA
>> information which --jit injected may be lost. The symbolization may be
>> lost as well.
>>
>> Except --jit, the VMAs information should be consistent between kernel
>> and perf tools. We shouldn't observe the problem. MUNMAP event is not
>> needed.
>>
>> Is my understanding correct?
>>
>> Do you have a test case for the problem?
>>
>> Thanks,
>> Kan
Powered by blists - more mailing lists