linux-kernel - Re: [RFC] Full syscall argument decode in "perf trace"

Message-ID: <20130918143346.GB3891@infradead.org>
Date:	Wed, 18 Sep 2013 11:33:46 -0300
From:	Arnaldo Carvalho de Melo <acme@...hat.com>
To:	Denys Vlasenko <dvlasenk@...hat.com>
Cc:	Tom Zanussi <tzanussi@...il.com>,
	Steven Rostedt <srostedt@...hat.com>,
	Ingo Molnar <mingo@...e.hu>, Jiri Olsa <jolsa@...hat.com>,
	Masami Hiramatsu <mhiramat@...hat.com>,
	Oleg Nesterov <oleg@...hat.com>, linux-kernel@...r.kernel.org,
	Denys Vlasenko <vda.linux@...glemail.com>
Subject: Re: [RFC] Full syscall argument decode in "perf trace"

On Wed, Sep 18, 2013 at 01:35:13PM +0200, Denys Vlasenko wrote:
> On 09/17/2013 09:06 PM, Arnaldo Carvalho de Melo wrote:
> > On Tue, Sep 17, 2013 at 05:10:55PM +0200, Denys Vlasenko wrote:
> >> I'm trying to figure out how to extend "perf trace".
> >  
> >> Currently, it shows syscall names and arguments, and only them.
> >> Meaning that syscalls such as open(2) are shown as:
> >  
> >>     open(filename: 140736118412184, flags: 0, mode: 140736118403776) = 3
> >  
> >> The problem is, of course, that the user wants to see the filename
> >> itself, not the address of its first byte.
> >  
> >> To improve that, we need to fetch the pointed-to data.
> >> There are two approaches to this: extending
> >> "raw_syscalls:sys_{enter,exit}" tracepoint so that it returns this data,
> >> or selectively stopping the traced process when it reaches the tracepoint.
> > 
> > We don't want to stop the process at all, this is one of the major
> > advantages of 'perf trace' over 'strace'.
> 
> This is a worthy goal. strace is so slow precisely because it stops
> the traced process so often. strace developers do want to avoid
> as many of these stops as possible.
> 
> I'm not sure that "not stopping ever" is achievable, though.
> There are cases where stopping is necessary.

Can't we first try to achieve what is possible with the existing
infrastructure, so that by combining 'perf trace' and 'strace' we
get something that is better than plain 'strace'?

> For example, after a clone() call, depending on the tracer's needs,
> there may be operations which must be done on the new child
> before it is allowed to run.
> 
> strace used to use hideous, unsafe workarounds to catch children,
> until ptrace was augmented with features which made children stop
> immediately.
> 
> Do you think you can work around that? I just don't see how.

I haven't even thought about it 8-)
 
> > Look at the tmp.perf/trace2 branch in my git repo, tglx and Ingo added a
> > tracepoint to vfs_getname to use that.
> 
> I know that this is the way to fetch syscall args without stopping,
> yes.
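
To make the vfs_getname idea concrete, here is a minimal sketch (in Python, purely illustrative, not the code in the tmp.perf/trace2 branch) of how a trace consumer could pair a syscall-enter event with a pathname reported by a separate tracepoint on the same thread. The `Event` record, the `ARG_NAMES` table, the `decode` helper and the "/etc/passwd" path are all assumptions made up for this example; real perf samples look different.

```python
from collections import namedtuple

# Hypothetical event record; real perf samples carry many more fields.
Event = namedtuple("Event", "tid kind payload")

# Argument names per syscall (illustrative subset only).
ARG_NAMES = {"open": ("filename", "flags", "mode")}


def decode(events):
    """Yield one formatted line per completed syscall."""
    pending = {}  # tid -> state of the syscall currently in flight
    for ev in events:
        if ev.kind == "sys_enter":
            name, args = ev.payload
            pending[ev.tid] = {"name": name, "args": list(args), "path": None}
        elif ev.kind == "vfs_getname":
            entry = pending.get(ev.tid)
            # Keep only the first pathname seen during this syscall.
            if entry is not None and entry["path"] is None:
                entry["path"] = ev.payload
        elif ev.kind == "sys_exit":
            entry = pending.pop(ev.tid, None)
            if entry is None:
                continue  # exit without a recorded enter; skip it
            names = ARG_NAMES.get(entry["name"], ())
            parts = []
            for i, value in enumerate(entry["args"]):
                label = names[i] if i < len(names) else f"arg{i}"
                # Replace the raw pointer with the string, if we got one.
                if label == "filename" and entry["path"] is not None:
                    value = f'"{entry["path"]}"'
                parts.append(f"{label}: {value}")
            yield f'{entry["name"]}({", ".join(parts)}) = {ev.payload}'


# Assumed sample stream: the open(2) call quoted above, plus a made-up path.
events = [
    Event(1234, "sys_enter", ("open", (140736118412184, 0, 140736118403776))),
    Event(1234, "vfs_getname", "/etc/passwd"),
    Event(1234, "sys_exit", 3),
]
for line in decode(events):
    print(line)
```

If no pathname event arrives between enter and exit, the sketch falls back to printing the raw pointer value, which is the current behaviour described above. The traced process is never stopped; the cost is one extra tracepoint per kind of pointed-to data.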
> 
> The problem: ~100 more tracepoints need to be added merely to get
> to the point where strace already is, wrt quality of syscall decoding.
> strace has nearly 300 separate custom syscall formatting functions,
> some of them quite complex.
> 
> If we need to add a syscall stopping feature (which, as I said above,
> will be necessary anyway IMO), then syscall decoding can be as good
> as strace's *already*. Tracepoints can then be added gradually
> to make it faster.
> 
> I am thinking about going in this direction.
> 
> Therefore my question should be restated as:
> 
> Would perf developers accept a "syscall pausing" feature,
> or would it be rejected?

Do you have some patch for us to try?

- Arnaldo