Date:   Thu, 31 Aug 2017 16:31:36 +0200
From:   Vlastimil Babka <vbabka@...e.cz>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     Arnaldo Carvalho de Melo <acme@...nel.org>,
        Ingo Molnar <mingo@...nel.org>, linux-kernel@...r.kernel.org,
        Namhyung Kim <namhyung@...nel.org>,
        David Ahern <dsahern@...il.com>, Jiri Olsa <jolsa@...hat.com>,
        Minchan Kim <minchan@...nel.org>,
        Peter Zijlstra <a.p.zijlstra@...llo.nl>, linux-mm@...ck.org
Subject: Re: [PATCH 1/5] tracing, mm: Record pfn instead of pointer to struct
 page

On 08/31/2017 03:43 PM, Steven Rostedt wrote:
> On Mon, 31 Jul 2017 09:43:41 +0200 Vlastimil Babka <vbabka@...e.cz> wrote:
> 
>> On 04/14/2015 12:14 AM, Arnaldo Carvalho de Melo wrote:
>>> From: Namhyung Kim <namhyung@...nel.org>
>>>
>>> The struct page is opaque for userspace tools, so it'd be better to save
>>> pfn in order to identify page frames.
>>>
>>> The textual output of $debugfs/tracing/trace file remains unchanged and
>>> only raw (binary) data format is changed - but thanks to libtraceevent,
>>> userspace tools which deal with the raw data (like perf and trace-cmd)
>>> can parse the format easily.  
>>
>> Hmm it seems trace-cmd doesn't work that well, at least on current
>> x86_64 kernel where I noticed it:
>>
>>  trace-cmd-22020 [003] 105219.542610: mm_page_alloc:        [FAILED TO PARSE] pfn=0x165cb4 order=0 gfp_flags=29491274 migratetype=1
> 
> Which version of trace-cmd failed? It parses for me. Hmm, the
> vmemmap_base isn't in the event format file. It's the actual address.
> That's probably what failed to parse.

Mine says 2.6. With 4.13-rc6 I get FAILED TO PARSE.

> 
>>
>> I'm quite sure it's due to the "page=%p" part, which uses pfn_to_page().
>> The events/kmem/mm_page_alloc/format file contains this for page:
>>
>> REC->pfn != -1UL ? (((struct page *)vmemmap_base) + (REC->pfn)) : ((void *)0)
> 
> But yeah, I think the output is wrong. I just ran this:
> 
>  page=0xffffea00000a62f4 pfn=680692 order=0 migratetype=0 gfp_flags=GFP_KERNEL_ACCOUNT|__GFP_ZERO|__GFP_NOTRACK
> 
> But running it with trace-cmd report -R (raw format):
> 
>  mm_page_alloc:         pfn=0xa62f4 order=0 gfp_flags=24150208 migratetype=0
> 
> The parser currently ignores types, so it doesn't do pointer
> arithmetic correctly, and that would be hard to do here as it doesn't
> know the size of the struct page. What could work is if we changed the
> printf fmt to be:
> 
>   (unsigned long)(0xffffea0000000000UL) + (REC->pfn * sizeof(struct page))
> 
> 
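
To make the arithmetic above concrete, here is a rough sketch in plain
userspace C. It assumes the x86_64 vmemmap base 0xffffea0000000000 from the
format file and, as a labeled assumption, sizeof(struct page) == 64 (common
on x86_64 but config-dependent). The function names are hypothetical and
are not part of trace-cmd or libtraceevent.

```c
#include <stdint.h>

/* vmemmap base as it appears in the format file quoted in this thread. */
#define VMEMMAP_BASE 0xffffea0000000000ULL

/* Assumption: sizeof(struct page) is 64 bytes; this depends on the config. */
#define ASSUMED_STRUCT_PAGE_SIZE 64ULL

/* What a type-blind parser computes: base + pfn, with no scaling. */
uint64_t page_addr_untyped(uint64_t pfn)
{
	return VMEMMAP_BASE + pfn;
}

/* What C pointer arithmetic on (struct page *) actually means. */
uint64_t page_addr_scaled(uint64_t pfn)
{
	return VMEMMAP_BASE + pfn * ASSUMED_STRUCT_PAGE_SIZE;
}
```

For pfn=0xa62f4 the untyped version gives 0xffffea00000a62f4, which matches
the page= value in the report above, while the scaled version gives
0xffffea000298bd00 under the 64-byte assumption.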
>>
>> I think userspace can't know vmemmap_base nor the implied sizeof(struct
>> page) for pointer arithmetic?
>>
>> On older 4.4-based kernel:
>>
>> REC->pfn != -1UL ? (((struct page *)(0xffffea0000000000UL)) + (REC->pfn)) : ((void *)0)
> 
> This is what I have on 4.13-rc7
> 
>>
>> This also fails to parse, so it must be the struct page part?
> 
> Again, what version of trace-cmd do you have?

On the older distro it was 2.0.4

> 
>>
>> I think the problem is, even if we solve this with some more
>> preprocessor trickery to make the format file contain only constant
>> numbers, pfn_to_page() on e.g. sparse memory model without vmemmap is
>> more complicated than simple arithmetic, and can't be exported in the
>> format file.
>>
>> I'm afraid that to support userspace parsing of the trace data, we will
>> have to store both struct page and pfn... or perhaps give up on reporting
>> the struct page pointer completely. Thoughts?
> 
> Had some thoughts up above.

Yeah, it could be made to work for some configurations, but see the part
about "sparse memory model without vmemmap" above.
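
To illustrate why, here is a toy userspace model, not kernel code. It
assumes each memory section has its own separately allocated array of
struct page, found through a table that only exists at runtime. The section
size, the table contents and all toy_* names are made up for illustration;
the 64-byte struct size is also an assumption.

```c
#include <stdint.h>

#define TOY_SECTION_SHIFT	15	/* hypothetical: 32768 pfns per section */
#define TOY_PFNS_PER_SECTION	(1ULL << TOY_SECTION_SHIFT)
#define TOY_STRUCT_PAGE_SIZE	64ULL	/* assumed sizeof(struct page) */
#define TOY_NR_SECTIONS		2

/* Made-up runtime table: where each section's struct page array lives. */
static const uint64_t toy_section_mem_map[TOY_NR_SECTIONS] = {
	0xffff880010000000ULL,
	0xffff880450000000ULL,
};

/* Returns the modeled struct page address, or 0 for an unknown section. */
uint64_t toy_pfn_to_page(uint64_t pfn)
{
	uint64_t section = pfn >> TOY_SECTION_SHIFT;
	uint64_t offset = pfn & (TOY_PFNS_PER_SECTION - 1);

	if (section >= TOY_NR_SECTIONS)
		return 0;

	/* The base depends on a table lookup, not on one constant. */
	return toy_section_mem_map[section] + offset * TOY_STRUCT_PAGE_SIZE;
}
```

In this model the last pfn of section 0 and the first pfn of section 1 map
to unrelated addresses, so no single "base + pfn * size" expression in the
format file could reproduce the result without also exporting the table.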

> -- Steve
> 
