Message-ID: <20090916053017.GD5121@nowhere>
Date:	Wed, 16 Sep 2009 07:30:19 +0200
From:	Frederic Weisbecker <fweisbec@...il.com>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	LKML <linux-kernel@...r.kernel.org>,
	Li Zefan <lizf@...fujitsu.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Masami Hiramatsu <mhiramat@...hat.com>
Subject: Re: [GIT PULL v2] tracing/kprobes: v1 + two fixes

On Thu, Aug 27, 2009 at 05:26:25PM +0200, Ingo Molnar wrote:
> 
> It would also be nice to have a pie-in-the-sky list of usecases and 
> workflows where this would be useful, and of future planned 
> features. (maybe we want some of them before we merge it upstream)
> 
> Why would the upstream kernel want to have this feature, and what is 
> the road ahead in terms of integration into tooling?
> 
> Thanks,
> 
> 	Ingo

In the long term, it would have the same horizon as static tracepoint events.
It actually already does: it supports filters, perf, etc.

For now, one still has to create these tracepoints through debugfs.

So what does it bring us?

First of all, the ability to profile the kernel at any arbitrary point.

1) It can be useful as a single counter

Say you want to trace:

long sys_kill(int pid, int sig)

(I know it's a bad example, we already have syscalls tracepoints, it's just
for the example).

And you want to see who calls this function the most. You could probably just
do:

	sudo perf record -f -a -g
	./perf report

Then find your function in the list and look at its callchain.

Of course the timer could give you the overhead of send_signal,
but:

- at the cost of profiling the whole system
- putting a kprobe there plus -c 1 on record would give you more
  accurate results, you won't lose any callchains
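
To illustrate why recording every hit answers the "who calls this most"
question, here is a minimal sketch in Python. It is not how perf does it
internally; count_callers is a hypothetical helper and the callchains are
made-up data. With one sample per hit, counting the immediate caller in
each callchain is enough:

```python
from collections import Counter

def count_callers(callchains):
    """Count the immediate caller of the probed function for each hit."""
    counts = Counter()
    for chain in callchains:
        # chain[0] is the probed function itself, chain[1] is its caller.
        if len(chain) > 1:
            counts[chain[1]] += 1
    # Most frequent caller first.
    return counts.most_common()

# Hypothetical callchains, innermost frame first (not real perf output).
samples = [
    ["sys_kill", "caller_a", "entry"],
    ["sys_kill", "caller_b", "entry"],
    ["sys_kill", "caller_a", "entry"],
]
print(count_callers(samples))
```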

2) It can be useful as a tracepoint

Now you have your profile, and you want to know more about it.
You may want to know which signals and which tasks are most often
involved in this function call.

So you can fetch the pid and sig arguments. You can also set pid=a0
and sig=a1 in the kprobes debugfs interface, so that the format
uses these names instead of the raw a0, a1.
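
As a rough sketch of what such a definition could look like, the Python
helper below only builds the text of a probe definition. It assumes the
"p:EVENT SYMBOL NAME=FETCHARG ..." form for the kprobes debugfs interface;
build_kprobe_event is a hypothetical helper, and the sig_kill event name is
taken from the example output further down. Writing the line to the
debugfs file is left out on purpose, since the path depends on where
debugfs is mounted:

```python
def build_kprobe_event(event, symbol, named_args):
    """Return a probe definition such as 'p:EVENT SYMBOL NAME=ARG ...'.

    The 'p:' prefix and the NAME=ARG form are assumptions about the
    interface syntax, not something this helper checks.
    """
    fetch = " ".join(f"{name}={arg}" for name, arg in named_args)
    # rstrip() drops the trailing space when there are no arguments.
    return f"p:{event} {symbol} {fetch}".rstrip()

line = build_kprobe_event("sig_kill", "sys_kill", [("pid", "a0"), ("sig", "a1")])
print(line)  # p:sig_kill sys_kill pid=a0 sig=a1
```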

If you want a high level of detail, you can just do

	perf trace

And look at the result.

	sig_kill: (common headers), pid=... sig=...

That, in essence, is a live-patched trace_printk(),
something that I personally miss every day.

Also on my perf trace TODO list is the ability to sort
by fields:

        ./perf trace -s pid

        pid = 4765
         |
         |
         ------------ sig_kill: .... pid = 4765, sig = 7
         |
         ------------ sig_kill: .... pid = 4765, sig = 10
         |
         ------------ etc...

        pid = 7645
         |
         |
         ------------ etc...
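
The grouping itself is simple. Here is a minimal Python sketch of the
idea, not of the planned perf trace implementation. It assumes event lines
carry "name = value" fields as in the mockup above; group_by_field is a
hypothetical helper and the event lines are made-up data:

```python
import re
from collections import defaultdict

# Matches "name = value" or "name=value" pairs in an event line.
FIELD_RE = re.compile(r"(\w+)\s*=\s*(\w+)")

def group_by_field(lines, key):
    """Group event lines by the value of one field, in first-seen order."""
    groups = defaultdict(list)
    for line in lines:
        fields = dict(FIELD_RE.findall(line))
        if key in fields:
            groups[fields[key]].append(line)
    return dict(groups)

# Hypothetical event lines (not real perf trace output).
events = [
    "sig_kill: pid = 4765, sig = 7",
    "sig_kill: pid = 7645, sig = 9",
    "sig_kill: pid = 4765, sig = 10",
]
for pid, evs in group_by_field(events, "pid").items():
    print(f"pid = {pid}")
    for ev in evs:
        print(f"  ------------ {ev}")
```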

Also on my perf trace TODO list is the ability to get the callchains:

	./perf trace -s pid -g

        pid = 4765
         |
         |
         ------------ sig_kill: .... pid = 4765, sig = 7
         |              |
         |              |
         |              -------- caller 1
         |              |
         |              -------- caller 2
         |              |
         |              -------- caller 3
         |              |
         |              --------- .....
         |
         ------------ .......

3) It can find *much* more sunshine with C-expressions

I've used kprobes events through debugfs for debugging purposes.
If you just want to fetch the arguments of a function or global
variables, it's fine and easy to use.
But once you want to dig in and display some local variables,
it takes too much time and pain (you have to find at which ip you can
fetch which register, and which register matches the variable you want).

As you know, Masami has posted a translator from C-like level
expressions to kprobes debugfs command line using libdwarf.

One of the plans is to integrate this tool into perf
so that one can fetch values by variable name (global and local)
and set such smart dynamic tracepoints everywhere in the kernel
(as long as the code is not __kprobes annotated).

Concerning the possible syntax and workflow of this tool,
it's in daily open debate :)

	Frederic.
