Message-ID: <490F035A.5070209@gmail.com>
Date:	Mon, 03 Nov 2008 15:57:46 +0200
From:	Török Edwin <edwintorok@...il.com>
To:	Ingo Molnar <mingo@...e.hu>
CC:	Robert Richter <robert.richter@....com>,
	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>,
	srostedt@...hat.com, a.p.zijlstra@...llo.nl, sandmann@...mi.au.dk,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] Identify which executable object the userspace address
 belongs to. Store thread group leader id, and use it to lookup the address
 in the process's map. We could have looked up the address on thread's map,
 but the thread might not exist by the time we are called. The process might
 not exist either, but if you are reading trace_pipe, that is unlikely.

On 2008-11-03 10:29, Ingo Molnar wrote:
> * Török Edwin <edwintorok@...il.com> wrote:
>
>   
>>> Your patches are a nice feature we want to have nevertheless - to 
>>> be able to see where a user-space app is running has been one of 
>>> the historically weak points of kernel instrumentation.
>>>       
>> Thanks.
>> It currently works for x86 only, but architecture porters can add
>> support for theirs quite easily; it just needs to be modeled after how
>> oprofile does it, for example.
>> BTW would it make sense to change oprofile and the sysprof tracer to use
>> save_stack_trace_user? It would eliminate some code duplication.
>>     
>
> that definitely sounds like the right direction. I've Cc:-ed Robert 
> Richter, the Oprofile maintainer - please Cc: him to code that touches 
> oprofile.
>
> note that NMI interaction of user-space stackframe walkers can be a 
> bit tricky: the basic problem is that if you fetch a user-space 
> stackframe that can create a fault

The code in trace_sysprof.c (which I used as the basis for
save_stack_trace_user) disables page faults before reading the
stack frame from userspace. Does that avoid this problem?

Note that because it is used from ftrace, the user stack walker can be
called from the page fault handler itself, and if it were allowed to
fault, that could lead to some form of deadlock. Are the ftrace
functions protected against recursively re-entering themselves?

> , and the IRET at the end of the 
> fault handler will re-enable NMIs (violating the NMI code's 
> assumptions).
>   

Is this already a problem with oprofile's user-stack walker?

> there are patches on lkml written by Mathieu Desnoyers that solve this 
> by changing all the fault path to use RET instead of IRET. It might 
> make sense to dust them off - we carried them for a long time in -tip 
> and they were robust. (they just never had any really strong 
> justification and were rather complex - that changes now)
>
> Mathieu, what do you think?
>
>   
>> Would it make sense to add a script that post-processes the output 
>> to scripts/tracing?
>>
>> It would parse a trace log (from trace or latency_trace) and use 
>> addr2line to resolve the address to source:line, and if successful 
>> replace the relative address with that; and also group identical 
>> stack traces together.
>>     
>
> sure, please add it to scripts/tracing/.
>   

Ok, will do so in v3.
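
A rough sketch of what such a script might do, in Python. This is not
the script planned for v3, only an illustration under assumptions: each
frame is assumed to be printed as "/path/to/object[+0xoffset]", and the
names parse_frames, group_traces and resolve are hypothetical. It groups
identical stack traces and asks addr2line for source:line, keeping the
relative address when the lookup fails.

```python
import re
import subprocess
from collections import Counter

# Assumed frame format: absolute object path plus relative address,
# e.g. "/root/a.out[+0x480]". The real trace output may differ.
FRAME_RE = re.compile(r"(/[^\s\[\]]+)\[\+(0x[0-9a-fA-F]+)\]")


def parse_frames(line):
    """Return a tuple of (object, offset) pairs found on one trace line."""
    return tuple(FRAME_RE.findall(line))


def group_traces(lines):
    """Count how often each identical user stack trace occurs."""
    counts = Counter()
    for line in lines:
        frames = parse_frames(line)
        if frames:
            counts[frames] += 1
    return counts


def resolve(obj, offset, cache):
    """Map (object, offset) to "file:line" via addr2line.

    Falls back to the original "object[+offset]" text when addr2line is
    missing, fails, or cannot find debug info (output starting with "??").
    """
    key = (obj, offset)
    if key not in cache:
        fallback = f"{obj}[+{offset}]"
        try:
            out = subprocess.run(
                ["addr2line", "-e", obj, offset],
                capture_output=True, text=True, check=True,
            ).stdout.strip()
        except (OSError, subprocess.CalledProcessError):
            out = ""
        cache[key] = out if out and not out.startswith("??") else fallback
    return cache[key]


def report(lines):
    """Return the grouped traces as text lines, most frequent first."""
    cache = {}
    result = []
    for frames, count in group_traces(lines).most_common():
        names = [resolve(obj, off, cache) for obj, off in frames]
        result.append(f"{count:6d}  " + " <- ".join(names))
    return result
```

Passing the relative address straight to addr2line is only expected to
work for shared objects and position-independent executables; a
fixed-address executable would need its load base added first.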

> The best approach would be if the kernel could output the best info by 
> default

The kernel could do some grouping and counting (as latencytop does), but
I don't see where it would fit in ftrace's infrastructure.

I think ftrace's one-entry-per-event model is useful in many situations
(debugging, latency measurements), but if the events occur too
frequently it can produce too much data. In that case it would be more
efficient to count and group similar entries in-kernel before
outputting them to userspace.
Perhaps as a layer on top of ftrace? What do you think?

>  - but that seems rather hard for addr2line functionality which 
> involves debuginfo processing, etc.
>   

Yes, it would be overkill to try to do that from the kernel when it
is so easy to do from userspace ;)

Best regards,
--Edwin