Message-ID: <CAE+MWFvA-VEDFnPSCS9y+kAH=SXQJnfhqYB+oxfp2FFJccdfUg@mail.gmail.com>
Date:   Wed, 14 Jun 2017 13:47:17 -0400
From:   Will Hawkins <whh8b@...ginia.edu>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Namhyung Kim <namhyung@...il.com>
Subject: Re: Ftrace vs perf user page fault statistics differences

On Tue, Jun 13, 2017 at 10:13 PM, Steven Rostedt <rostedt@...dmis.org> wrote:
> On Tue, 13 Jun 2017 21:27:20 -0400
> Will Hawkins <whh8b@...ginia.edu> wrote:
>
>> But, while I have your ear, I was wondering if you could direct me to
>> where the page_fault_user gathers its information (when tracing) and
>> where that information is formatted for printing (in trace-cmd). I'd
>> really like to investigate modifying that code so that it provides
>> "better" information than
>>
>> address=__per_cpu_end ip=__per_cpu_end
>>
>> Now that I'm into this, I want to really dig in. If you can give me a
>> pointer, that would really help me get started. After that I can
>> attempt to make some patches and see where it leads.
>>
>> For the x86 architecture, the relevant files seem to be
>>
>> mm/fault.c (with the trace_page_fault_entries function)
>>
>> and
>>
>> include/asm/trace/exceptions.h (where the exception class is built)
>
> Correct.
>
>>
>> What seems odd there is that the exception *should* be responding with
>> "better" information -- address is set to the value of the cr2
>> register, which should contain the address that caused the fault.
>> Perhaps I just don't understand what the symbol __per_cpu_end means.
>>
>> Any information you can shed on this would be really great.
>
> hmm, actually that looks to be translating the ip address into
> functions. It shouldn't be doing that, and it doesn't do it for me with
> the latest kernel and trace-cmd.
>
> What does it give you in trace-cmd report -R ? The -R will not parse
> the printk-fmt of the event format files, and just show raw numbers.

Brilliant advice. And, guess what? It gives me exactly what I expected:

page-7783    6d... 198598.003464: page_fault_user: address=0x4000e0 ip=0x4000e0 error_code=0x14
...
page-7783    6d... 198598.003466: page_fault_user: address=0x401473 ip=0x401473 error_code=0x14

Since I don't expect you to be following the details as closely as I
am, those are exactly the addresses that I expected to see. I think
that is the final confirmation of the hypothesis!
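
For anyone reading along, the error_code value in that output can be decoded by hand. The sketch below uses the x86 page-fault error code bit layout from the architecture manuals; the constant and function names are made up for illustration and are not the kernel's identifiers. It shows why address and ip are equal here: 0x14 marks a user-mode instruction fetch from a page that is not present, so the faulting address is the instruction pointer itself.

```python
# x86 page-fault error code bits (architectural definition).
# Names below are illustrative, not taken from kernel headers.
PF_PROT = 0x01   # 0: page not present, 1: protection violation
PF_WRITE = 0x02  # 0: read access, 1: write access
PF_USER = 0x04   # 0: kernel mode, 1: user mode
PF_RSVD = 0x08   # reserved bit set in a paging structure
PF_INSTR = 0x10  # fault happened on an instruction fetch


def decode_error_code(code):
    """Return human-readable properties of an x86 page-fault error code."""
    props = [
        "protection violation" if code & PF_PROT else "page not present",
        "write access" if code & PF_WRITE else "read access",
        "user mode" if code & PF_USER else "kernel mode",
    ]
    if code & PF_RSVD:
        props.append("reserved bit set")
    if code & PF_INSTR:
        props.append("instruction fetch")
    return props


print(decode_error_code(0x14))
```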

>
> Can you give me the contents of:
>
>  cat /sys/kernel/debug/tracing/exceptions/page_fault_user/format
>
> ?
>
> That's how trace-cmd parses it.

In the kernel version that I am running (again, pretty old) I do not
have this file. I do, however, have

/sys/kernel/debug/tracing/events/exceptions/page_fault_user/format

and the contents are:

name: page_fault_user
ID: 79
format:
    field:unsigned short common_type;    offset:0;    size:2;    signed:0;
    field:unsigned char common_flags;    offset:2;    size:1;    signed:0;
    field:unsigned char common_preempt_count;    offset:3;    size:1;    signed:0;
    field:int common_pid;    offset:4;    size:4;    signed:1;

    field:unsigned long address;    offset:8;    size:8;    signed:0;
    field:unsigned long ip;    offset:16;    size:8;    signed:0;
    field:unsigned long error_code;    offset:24;    size:8;    signed:0;

print fmt: "address=%pf ip=%pf error_code=0x%lx", (void *)REC->address, (void *)REC->ip, REC->error_code
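
To make the layout above concrete, here is a minimal sketch of unpacking one binary record using the offsets and sizes listed in the format file. It assumes a little-endian x86-64 machine, and the record is synthetic: the values for common_flags and common_preempt_count are placeholders, not data from the actual trace. The helper name parse_record is hypothetical and is not part of trace-cmd.

```python
import struct

# Field layout from the format file: offsets 0, 2, 3, 4, 8, 16, 24 (32 bytes).
# "<" selects little-endian byte order (an assumption) and disables padding.
RECORD = struct.Struct("<HBBiQQQ")
FIELDS = (
    "common_type", "common_flags", "common_preempt_count", "common_pid",
    "address", "ip", "error_code",
)


def parse_record(raw):
    """Unpack one page_fault_user record into a dict keyed by field name."""
    return dict(zip(FIELDS, RECORD.unpack(raw)))


# Synthetic record: event ID 79, pid 7783, and the first raw values
# reported above; flags and preempt count are placeholder zeros.
raw = RECORD.pack(79, 0, 0, 7783, 0x4000e0, 0x4000e0, 0x14)
rec = parse_record(raw)
print("address=%#x ip=%#x error_code=%#x"
      % (rec["address"], rec["ip"], rec["error_code"]))
```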

Again, this looks like exactly what I would expect since address has
the cr2 value in that function. Plus, we know that the raw value is
correct. I suppose that the "symbolification" of that value is done in
trace-cmd, right? So, perhaps that is where I should start looking for
the problem?
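
One possible explanation for the __per_cpu_end output, offered as a guess and not something confirmed in this thread: if %pf is resolved by looking up the nearest kernel symbol at or below the value, then a user-space address such as 0x4000e0 lands above the low per-cpu symbols and below kernel text, so the last per-cpu symbol wins. The sketch below illustrates that lookup with a hypothetical, heavily abbreviated symbol table; every address in it is made up.

```python
import bisect

# Hypothetical symbol table: (address, name) pairs, sorted by address.
# The addresses are invented for illustration only.
SYMBOLS = sorted([
    (0x0000000000000000, "__per_cpu_start"),
    (0x0000000000014000, "__per_cpu_end"),
    (0xffffffff81000000, "_text"),
    (0xffffffff81060000, "do_page_fault"),
])
ADDRS = [addr for addr, _ in SYMBOLS]


def nearest_symbol(addr):
    """Return the name of the closest symbol at or below addr, or None."""
    i = bisect.bisect_right(ADDRS, addr) - 1
    return SYMBOLS[i][1] if i >= 0 else None


# A user-space address falls between the per-cpu area and kernel text.
print(nearest_symbol(0x4000e0))
```

If that is what happens, printing such fields as plain hex would avoid the misleading name for user-space addresses.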

I definitely want to follow up on this and help where I can. That
said, I think I am satisfied with "our" (really, your) answer to the
original problem.

Thank you so much!
Will

>
>
>
>>
>> Thanks again for your patience with me! I think that I now understand
>> what is going on.
>
>
> No problem, it's good to be curious.
>
> -- Steve