Message-ID: <20090716155652.6266.39970.stgit@localhost.localdomain>
Date:	Thu, 16 Jul 2009 11:56:52 -0400
From:	Masami Hiramatsu <mhiramat@...hat.com>
To:	Ingo Molnar <mingo@...e.hu>, Steven Rostedt <rostedt@...dmis.org>,
	lkml <linux-kernel@...r.kernel.org>
Cc:	Avi Kivity <avi@...hat.com>, "H. Peter Anvin" <hpa@...or.com>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Ananth N Mavinakayanahalli <ananth@...ibm.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Andi Kleen <andi@...stfloor.org>,
	Jim Keniston <jkenisto@...ibm.com>,
	"K.Prasad" <prasad@...ux.vnet.ibm.com>,
	Przemysław Pawełczyk 
	<przemyslaw@...elczyk.it>, Vegard Nossum <vegard.nossum@...il.com>,
	Christoph Hellwig <hch@...radead.org>,
	"Frank Ch. Eigler" <fche@...hat.com>,
	Tom Zanussi <tzanussi@...il.com>,
	systemtap <systemtap@...rces.redhat.com>,
	kvm <kvm@...r.kernel.org>,
	DLE <dle-develop@...ts.sourceforge.net>
Subject: [PATCH -tip -v12 00/11] tracing: kprobe-based event tracer and x86
	instruction decoder

Hi,

Here are the v12 patches. I updated them for the latest -tip and
fixed some bugs.

Here are the patches of the kprobe-based event tracer for x86, version 12,
which allows you to probe various kernel events through the ftrace interface.
The tracer supports per-probe filtering, which lets you set filters on each
probe, and shows the format of each probe. I think this is a more generic
integration with ftrace, especially with the event tracer.

This version includes the following small fixes.
 - Fix a buffer overflow bug. (PATCH 8/11)
 - Fix indirect memory access string bug. (PATCH 8/11)
 - Remove subsystem event directory if it is empty. (PATCH 6/11)
 - Clean up code and remove redundant checks. (PATCH 8/11, 11/11)

This patchset also includes an x86(-64) instruction decoder which
supports non-SSE/FP opcodes and includes an x86 opcode map. The decoder
is used for finding instruction boundaries when inserting new
kprobes. I think it will be possible to share this opcode map
with KVM's decoder.
The decoder is tested at kernel build time: right after vmlinux is built,
the test compares the results of objdump with those of the decoder.
You can enable that test with CONFIG_X86_DECODER_SELFTEST=y.

This series can be applied on the latest linux-2.6.31-rc3-tip.

This supports only x86(-32/-64) (but porting it to other architectures
just needs kprobes/kretprobes and register and stack access APIs).


Enhancement ideas to be added after merging:
- Make a stress test of kprobes on this tracer.
  (see http://sources.redhat.com/ml/systemtap/2009-q2/msg01055.html)
- Easy probe-setting wrapper which analyzes DWARF.
- .init function tracing support.
- Support primitive types (long, ulong, int, uint, etc.) for args.


Kprobe-based Event Tracer
=========================

Overview
--------
This tracer is similar to the events tracer, which is based on the Tracepoint
infrastructure. Instead of Tracepoint, this tracer is based on kprobes (kprobe
and kretprobe). It can probe anywhere kprobes can probe (that is, any
function body except for __kprobes functions).

Unlike the function tracer, this tracer can probe instructions inside of
kernel functions. It allows you to check which instruction has been executed.

Unlike the Tracepoint based events tracer, this tracer can add new probe points
on the fly.

Similar to the events tracer, this tracer doesn't need to be activated via
current_tracer. Instead, just set probe points via
/sys/kernel/debug/tracing/kprobe_events. You can also set filters on each
probe event via /sys/kernel/debug/tracing/events/kprobes/<EVENT>/filter.


Synopsis of kprobe_events
-------------------------
  p[:EVENT] SYMBOL[+offs|-offs]|MEMADDR [FETCHARGS]	: Set a probe
  r[:EVENT] SYMBOL[+0] [FETCHARGS]			: Set a return probe

 EVENT			: Event name. If omitted, the event name is generated
			  based on SYMBOL+offs or MEMADDR.
 SYMBOL[+offs|-offs]	: Symbol+offset where the probe is inserted.
 MEMADDR		: Address where the probe is inserted.

 FETCHARGS		: Arguments. Each probe can have up to 128 args.
  %REG	: Fetch register REG
  sN	: Fetch Nth entry of stack (N >= 0)
  @ADDR	: Fetch memory at ADDR (ADDR should be in kernel)
  @SYM[+|-offs]	: Fetch memory at SYM +|- offs (SYM should be a data symbol)
  aN	: Fetch function argument. (N >= 0)(*)
  rv	: Fetch return value.(**)
  ra	: Fetch return address.(**)
  +|-offs(FETCHARG) : fetch memory at FETCHARG +|- offs address.(***)

  (*) aN may not be correct for asmlinkage functions or in the middle of
      a function body.
  (**) only for return probe.
  (***) this is useful for fetching a field of data structures.
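
 The synopsis above can be turned into a small helper that composes a
definition line before it is written to kprobe_events. This is only a sketch:
the function name kprobe_def is hypothetical and not part of this patchset, and
it checks nothing beyond the probe type and the 128-argument limit stated
above (symbols and fetch arguments are passed through unvalidated).

```shell
# kprobe_def KIND EVENT LOCATION [FETCHARGS...]
# Hypothetical helper (not part of the patchset): prints one definition
# line in the kprobe_events syntax described above.
#   KIND     : p (probe) or r (return probe)
#   EVENT    : event name, or "" to let the tracer generate one
#   LOCATION : SYMBOL[+offs|-offs] or MEMADDR
kprobe_def() {
    if [ "$#" -lt 3 ]; then
        echo "usage: kprobe_def KIND EVENT LOCATION [FETCHARGS...]" >&2
        return 1
    fi
    kind=$1
    event=$2
    location=$3
    shift 3
    case $kind in
        p|r) ;;
        *) echo "kprobe_def: KIND must be p or r" >&2; return 1 ;;
    esac
    # Each probe can have up to 128 fetch arguments.
    if [ "$#" -gt 128 ]; then
        echo "kprobe_def: too many fetch arguments" >&2
        return 1
    fi
    def=$kind
    if [ -n "$event" ]; then
        def="$def:$event"
    fi
    def="$def $location"
    for arg in "$@"; do
        def="$def $arg"
    done
    printf '%s\n' "$def"
}
```

 For example, "kprobe_def p myprobe do_sys_open a0 a1 a2 a3" prints
"p:myprobe do_sys_open a0 a1 a2 a3", and "kprobe_def r '' do_sys_open rv ra"
prints "r do_sys_open rv ra". The output can then be redirected to
kprobe_events.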


Per-Probe Event Filtering
-------------------------
 The per-probe event filtering feature allows you to set a different filter on
each probe and shows you which arguments will appear in the trace buffer. If an
event name is specified right after 'p:' or 'r:' in kprobe_events, the tracer
adds an event under tracing/events/kprobes/<EVENT>. In that directory you can
see 'id', 'enabled', 'format' and 'filter'.

enabled:
  You can enable/disable the probe by writing 1 or 0 to it.

format:
  It shows the format of this probe event. It also shows the aliases of the
 arguments which you specified in kprobe_events.

filter:
  You can write filtering rules for this event. You can use both alias
 names and field names to describe filters.
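
 The per-event files above could be driven from a script roughly as follows.
This is a sketch under assumptions: the helper names probe_set_filter and
probe_enable and the TRACING variable are hypothetical, the file name
'enabled' is taken from the description above and may differ on your kernel,
and the filter expression in the usage note assumes the comparison syntax of
the existing event filters.

```shell
# Root of the tracing directory; override TRACING to point elsewhere.
TRACING=${TRACING:-/sys/kernel/debug/tracing}

# probe_set_filter EVENT EXPR
# Hypothetical helper: writes a filter rule for one probe event.
probe_set_filter() {
    printf '%s\n' "$2" > "$TRACING/events/kprobes/$1/filter"
}

# probe_enable EVENT 0|1
# Hypothetical helper: enables (1) or disables (0) one probe event.
# The file name 'enabled' is the one given in the text above.
probe_enable() {
    case $2 in
        0|1) printf '%s\n' "$2" > "$TRACING/events/kprobes/$1/enabled" ;;
        *) echo "probe_enable: value must be 0 or 1" >&2; return 1 ;;
    esac
}
```

 With an event named myprobe that records a2, "probe_set_filter myprobe
'a2 == 0x8000'" followed by "probe_enable myprobe 1" would restrict the event
to hits whose third argument equals 0x8000, assuming the filter accepts that
expression.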


Event Profiling
---------------
 You can check the total number of probe hits and probe miss-hits via
/sys/kernel/debug/tracing/kprobe_profile.
 The first column is the event name, the second is the number of probe hits,
and the third is the number of probe miss-hits.
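
 Given the three-column layout described above, a short filter can list the
events that have missed at least once. This is a sketch: the function name
probe_misses is hypothetical, and it assumes the columns are separated by
whitespace.

```shell
# probe_misses: reads kprobe_profile-formatted text on stdin and prints
# "EVENT MISS-HITS" for every event whose third column is non-zero.
# Hypothetical helper; assumes whitespace-separated columns.
probe_misses() {
    awk 'NF >= 3 && $3 + 0 > 0 { print $1, $3 }'
}
```

 It would be used as "probe_misses < /sys/kernel/debug/tracing/kprobe_profile".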


Usage examples
--------------
To add a probe as a new event, write a new definition to kprobe_events
as below.

  echo p:myprobe do_sys_open a0 a1 a2 a3 > /sys/kernel/debug/tracing/kprobe_events

 This sets a kprobe at the top of the do_sys_open() function, recording
the 1st to 4th arguments as the "myprobe" event.

  echo r:myretprobe do_sys_open rv ra >> /sys/kernel/debug/tracing/kprobe_events

 This sets a kretprobe on the return point of the do_sys_open() function,
recording the return value and return address as the "myretprobe" event.
 You can see the format of these events via
/sys/kernel/debug/tracing/events/kprobes/<EVENT>/format.

  cat /sys/kernel/debug/tracing/events/kprobes/myprobe/format
name: myprobe
ID: 23
format:
	field:unsigned short common_type;	offset:0;	size:2;
	field:unsigned char common_flags;	offset:2;	size:1;
	field:unsigned char common_preempt_count;	offset:3;	size:1;
	field:int common_pid;	offset:4;	size:4;
	field:int common_tgid;	offset:8;	size:4;

	field: unsigned long ip;	offset:16;tsize:8;
	field: int nargs;	offset:24;tsize:4;
	field: unsigned long arg0;	offset:32;tsize:8;
	field: unsigned long arg1;	offset:40;tsize:8;
	field: unsigned long arg2;	offset:48;tsize:8;
	field: unsigned long arg3;	offset:56;tsize:8;

	alias: a0;	original: arg0;
	alias: a1;	original: arg1;
	alias: a2;	original: arg2;
	alias: a3;	original: arg3;

print fmt: "%lx: 0x%lx 0x%lx 0x%lx 0x%lx", ip, arg0, arg1, arg2, arg3


 You can see that the event has 4 arguments and the alias expressions
corresponding to them.

  echo > /sys/kernel/debug/tracing/kprobe_events

 This clears all probe points. You can see the traced information via
/sys/kernel/debug/tracing/trace.

  cat /sys/kernel/debug/tracing/trace
# tracer: nop
#
#           TASK-PID    CPU#    TIMESTAMP  FUNCTION
#              | |       |          |         |
           <...>-1447  [001] 1038282.286875: do_sys_open+0x0/0xd6: 0x3 0x7fffd1ec4440 0x8000 0x0
           <...>-1447  [001] 1038282.286878: sys_openat+0xc/0xe <- do_sys_open: 0xfffffffffffffffe 0xffffffff81367a3a
           <...>-1447  [001] 1038282.286885: do_sys_open+0x0/0xd6: 0xffffff9c 0x40413c 0x8000 0x1b6
           <...>-1447  [001] 1038282.286915: sys_open+0x1b/0x1d <- do_sys_open: 0x3 0xffffffff81367a3a
           <...>-1447  [001] 1038282.286969: do_sys_open+0x0/0xd6: 0xffffff9c 0x4041c6 0x98800 0x10
           <...>-1447  [001] 1038282.286976: sys_open+0x1b/0x1d <- do_sys_open: 0x3 0xffffffff81367a3a


 Each line shows when the kernel hits a probe, and <- SYMBOL means the kernel
returns from SYMBOL (e.g. "sys_open+0x1b/0x1d <- do_sys_open" means the kernel
returns from do_sys_open to sys_open+0x1b).
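
 The return-probe lines can be picked out of the trace by looking for the
"<-" marker. The following is a sketch: the function name retprobe_values is
hypothetical, and it assumes the line layout shown in the output above. The
first recorded value is the return value only because the example probe was
defined with "rv ra" in that order.

```shell
# retprobe_values: reads trace output on stdin and, for each return-probe
# line, prints "CALLEE RETURN_SITE FIRST_VALUE".
# Hypothetical helper; assumes the line layout shown above.
retprobe_values() {
    awk '{
        for (i = 2; i + 2 <= NF; i++) {
            if ($i == "<-") {
                callee = $(i + 1)
                sub(/:$/, "", callee)   # strip the trailing colon
                print callee, $(i - 1), $(i + 2)
                break
            }
        }
    }'
}
```

 It would be used as "retprobe_values < /sys/kernel/debug/tracing/trace".
Entry-probe lines and the "#" header lines contain no "<-" field and are
skipped.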


Thank you,

---

Masami Hiramatsu (11):
      tracing: Add kprobes event profiling interface
      tracing: Generate names for each kprobe event automatically
      tracing: Kprobe-tracer supports more than 6 arguments
      tracing: add kprobe-based event tracer
      tracing: Introduce TRACE_FIELD_ZERO() macro
      tracing: ftrace dynamic ftrace_event_call support
      x86: add pt_regs register and stack access APIs
      kprobes: cleanup fix_riprel() using insn decoder on x86
      kprobes: checks probe address is instruction boudary on x86
      x86: x86 instruction decoder build-time selftest
      x86: instruction decoder API


 Documentation/trace/kprobetrace.txt    |  147 ++++
 arch/x86/Kconfig.debug                 |    9 
 arch/x86/Makefile                      |    3 
 arch/x86/include/asm/inat.h            |  127 +++
 arch/x86/include/asm/insn.h            |  136 +++
 arch/x86/include/asm/ptrace.h          |   62 ++
 arch/x86/kernel/kprobes.c              |  197 ++---
 arch/x86/kernel/ptrace.c               |  112 +++
 arch/x86/lib/Makefile                  |   13 
 arch/x86/lib/inat.c                    |   82 ++
 arch/x86/lib/insn.c                    |  473 ++++++++++++
 arch/x86/lib/x86-opcode-map.txt        |  711 ++++++++++++++++++
 arch/x86/scripts/Makefile              |   19 
 arch/x86/scripts/distill.awk           |   42 +
 arch/x86/scripts/gen-insn-attr-x86.awk |  314 ++++++++
 arch/x86/scripts/test_get_len.c        |   99 +++
 arch/x86/scripts/user_include.h        |   49 +
 include/linux/ftrace_event.h           |   13 
 include/trace/ftrace.h                 |   22 -
 kernel/trace/Kconfig                   |   12 
 kernel/trace/Makefile                  |    1 
 kernel/trace/trace.h                   |   29 +
 kernel/trace/trace_event_types.h       |    4 
 kernel/trace/trace_events.c            |   72 +-
 kernel/trace/trace_export.c            |   43 +
 kernel/trace/trace_kprobe.c            | 1245 ++++++++++++++++++++++++++++++++
 26 files changed, 3873 insertions(+), 163 deletions(-)
 create mode 100644 Documentation/trace/kprobetrace.txt
 create mode 100644 arch/x86/include/asm/inat.h
 create mode 100644 arch/x86/include/asm/insn.h
 create mode 100644 arch/x86/lib/inat.c
 create mode 100644 arch/x86/lib/insn.c
 create mode 100644 arch/x86/lib/x86-opcode-map.txt
 create mode 100644 arch/x86/scripts/Makefile
 create mode 100644 arch/x86/scripts/distill.awk
 create mode 100644 arch/x86/scripts/gen-insn-attr-x86.awk
 create mode 100644 arch/x86/scripts/test_get_len.c
 create mode 100644 arch/x86/scripts/user_include.h
 create mode 100644 kernel/trace/trace_kprobe.c

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhiramat@...hat.com
