Message-ID: <CABPqkBR7-JreB8c3Y1rNGpqdjeSN71qUkPrMxV-wjOSaTEx+vQ@mail.gmail.com>
Date:	Thu, 29 Mar 2012 10:04:30 -0700
From:	Stephane Eranian <eranian@...gle.com>
To:	Jiri Olsa <jolsa@...hat.com>
Cc:	acme@...hat.com, a.p.zijlstra@...llo.nl, mingo@...e.hu,
	paulus@...ba.org, cjashfor@...ux.vnet.ibm.com, fweisbec@...il.com,
	gorcunov@...nvz.org, tzanussi@...il.com, mhiramat@...hat.com,
	rostedt@...dmis.org, robert.richter@....com, fche@...hat.com,
	linux-kernel@...r.kernel.org
Subject: Re: [RFC 00/15] perf: Add backtrace post dwarf unwind

On Wed, Mar 28, 2012 at 5:35 AM, Jiri Olsa <jolsa@...hat.com> wrote:
> hi,
> sending RFC version of the post unwinding user stack backtrace
> using dwarf unwind - via libunwind. The original work was
> done by Frederic. I mostly took his patches and made them
> compile in current kernel code plus I added some stuff here
> and there.
>
> The main idea is to store user registers and a portion of the user
> stack with the sample data during the record phase. Then during
> the report, when the data is presented, perform the actual
> dwarf unwind.
>
Although I understand why you need this for user level
dwarf unwinding, I think you also need to look at the more general
problem of capturing the machine state registers at the interrupted
IP as well. There are interesting measurements one can make with
those, such as sampling of function arguments.

I think the mechanism should allow the user to select which registers
are captured (you have that) but also where they are captured. You have
the user level state, but you also want the interrupted state or the
precise state, i.e., the registers extracted at retirement of the instruction
that caused the sampling PMU event (PEBS on Intel). Personally, I
am interested in the last two. I had a prototype patch for those.
It is based on the same approach in terms of register naming. You
need to be able to name individual registers. That's obviously arch
specific and you have that. Now there needs to be a way to indicate
where the registers must be captured. Note that you may want
to combine user + interrupt states. So I think we may need multiple
register bitmasks.
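
As a rough illustration of the multiple-bitmask idea, here is a minimal
sketch in C with one register bitmask per capture point. Every name in it
(sketch_reg, sketch_capture, sketch_regs_request, sketch_dump_regs,
sketch_dump_sample) is hypothetical and is not taken from the patch set or
from any existing perf interface. The register indices are placeholders
for the arch specific naming, and a capture point whose state is not
available for a given sample is simply skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register indices; real naming would be arch specific. */
enum sketch_reg {
	SKETCH_REG_AX,
	SKETCH_REG_BX,
	SKETCH_REG_CX,
	SKETCH_REG_DX,
	SKETCH_REG_SP,
	SKETCH_REG_IP,
	SKETCH_REG_MAX,
};

/* Hypothetical capture points: where the register state is taken from. */
enum sketch_capture {
	SKETCH_CAPTURE_USER,	/* user level state */
	SKETCH_CAPTURE_INTR,	/* state at the interrupted IP */
	SKETCH_CAPTURE_PRECISE,	/* state at retirement (e.g. PEBS) */
	SKETCH_CAPTURE_MAX,
};

/* One bitmask per capture point, so user + interrupt can be combined. */
struct sketch_regs_request {
	uint64_t mask[SKETCH_CAPTURE_MAX];
};

/*
 * Copy the registers selected by mask from regs[] into out[], in
 * ascending register index order. Returns the number of values written.
 */
static size_t sketch_dump_regs(uint64_t mask,
			       const uint64_t regs[SKETCH_REG_MAX],
			       uint64_t *out)
{
	size_t n = 0;
	int i;

	assert(regs != NULL && out != NULL);

	for (i = 0; i < SKETCH_REG_MAX; i++) {
		if (mask & (1ULL << i))
			out[n++] = regs[i];
	}
	return n;
}

/*
 * Walk the capture points in order and dump the requested registers of
 * each one. state[p] is NULL when that state is not available for this
 * sample; such capture points contribute nothing to the output.
 */
static size_t sketch_dump_sample(const struct sketch_regs_request *req,
				 const uint64_t *state[SKETCH_CAPTURE_MAX],
				 uint64_t *out)
{
	size_t n = 0;
	int p;

	for (p = 0; p < SKETCH_CAPTURE_MAX; p++) {
		if (state[p] == NULL || req->mask[p] == 0)
			continue;
		n += sketch_dump_regs(req->mask[p], state[p], out + n);
	}
	return n;
}
```

With a layout like this, a request for SP and IP from the user state plus
AX from the interrupted state would produce three values per sample, and
the consumer can decode them again from the same masks.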

I am aware of the security issues with sampling machine state registers
at the kernel level but they can be restricted, just like system-wide sessions.



> attached patches:
>  01/15 perf, tool: Fix the array pointer to follow event data properly
>  02/15 uaccess: Add new copy_from_user_gup API
>  03/15 perf: Unified API to record selective sets of arch registers
>  04/15 perf: Add ability to dump user regs
>  05/15 perf: Add ability to dump part of the user stack
>  06/15 perf: Add attribute to filter out user callchains
>  07/15 perf, tool: Factor DSO symtab types to generic binary types
>  08/15 perf, tool: Add interface to read DSO image data
>  09/15 perf, tool: Add '.note' check into search for NOTE section
>  10/15 perf, tool: Back [vdso] DSO with real data
>  11/15 perf, tool: Add interface to arch registers sets
>  12/15 perf, tool: Add libunwind dependency for dwarf cfi unwinding
>  13/15 perf, tool: Support user regs and stack in sample parsing
>  14/15 perf, tool: Support for dwarf cfi unwinding on post processing
>  15/15 perf, tool: Support for dwarf mode callchain on perf record
>
> The unwind processing could considerably prolong the computing
> time of the report command, but I believe this could be improved.
>   - caching DSO data accesses (as suggested in patch 8/15)
>   - maybe separate thread with unwind processing on background,
>     so the user does not need to wait for all the data to be
>     processed.
>
> I tested on Fedora. There was not much gain on i386, because the
> binaries are compiled with frame pointers. Though the dwarf
> backtrace is more accurate and unwraps calls in more detail
> (functions that do not set the frame pointers).
>
> I could see some improvement on x86_64, where I got full backtrace
> where current code could get just the first address out of the
> instruction pointer.
>
> Example on x86_64:
> [dwarf]
>   perf record -g -e syscalls:sys_enter_write date
>
>   100.00%     date  libc-2.14.90.so  [.] __GI___libc_write
>               |
>               --- __GI___libc_write
>                   _IO_file_write@@GLIBC_2.2.5
>                   new_do_write
>                   _IO_do_write@@GLIBC_2.2.5
>                   _IO_file_overflow@@GLIBC_2.2.5
>                   0x4022cd
>                   0x401ee6
>                   __libc_start_main
>                   0x4020b9
>
>
> [frame pointer]
>   perf record -g fp -e syscalls:sys_enter_write date
>
>   100.00%     date  libc-2.14.90.so  [.] __GI___libc_write
>               |
>               --- __GI___libc_write
>
> Also I tested on coreutils binaries mainly, but I could see
> getting wider backtraces with dwarf unwind for more complex
> application like firefox.
>
> The unwind should go through [vdso] object. I haven't studied
> the [vsyscall] yet, so not sure there.
>
> Attached patches should work on both x86 and x86_64. I did
> some initial testing so far.
>
> The unwind backtrace can be interrupted by following reasons:
>    - bug in unwind information of processed shared library
>    - bug in unwind processing code (most likely ;) )
>    - insufficient dump stack size
>    - wrong register value - x86_64 does not store whole
>      set of registers when in exception, but so far
>      it looks like RIP and RSP should be enough
>
> I'd like to have some automated tests on this, but so far nothing
> smart is coming to me.. ;)
>
> thanks for comments,
> jirka
> ---
>  arch/Kconfig                                       |    7 +
>  arch/x86/Kconfig                                   |    1 +
>  arch/x86/include/asm/perf_regs.h                   |   15 +
>  arch/x86/include/asm/perf_regs_32.h                |   86 +++
>  arch/x86/include/asm/perf_regs_64.h                |  101 ++++
>  arch/x86/include/asm/uaccess.h                     |    8 +-
>  arch/x86/kernel/cpu/perf_event.c                   |    4 +-
>  arch/x86/kernel/cpu/perf_event_intel_ds.c          |    3 +-
>  arch/x86/kernel/cpu/perf_event_intel_lbr.c         |    2 +-
>  arch/x86/lib/usercopy.c                            |    4 +-
>  arch/x86/oprofile/backtrace.c                      |    4 +-
>  include/asm-generic/uaccess.h                      |    4 +
>  include/linux/perf_event.h                         |   36 ++-
>  kernel/events/callchain.c                          |    4 +-
>  kernel/events/core.c                               |  127 +++++-
>  kernel/events/internal.h                           |   59 ++-
>  kernel/events/ring_buffer.c                        |    4 +-
>  tools/perf/Makefile                                |   40 ++-
>  tools/perf/arch/x86/Makefile                       |    3 +
>  tools/perf/arch/x86/include/perf_regs.h            |  101 ++++
>  tools/perf/arch/x86/util/unwind.c                  |  111 ++++
>  tools/perf/builtin-record.c                        |   89 +++-
>  tools/perf/builtin-report.c                        |   24 +-
>  tools/perf/builtin-script.c                        |   56 ++-
>  tools/perf/builtin-test.c                          |    3 +-
>  tools/perf/builtin-top.c                           |    7 +-
>  tools/perf/config/feature-tests.mak                |   25 +
>  tools/perf/perf.h                                  |    9 +-
>  tools/perf/util/annotate.c                         |    2 +-
>  tools/perf/util/event.h                            |   15 +-
>  tools/perf/util/evlist.c                           |   16 +
>  tools/perf/util/evlist.h                           |    2 +
>  tools/perf/util/evsel.c                            |   36 ++-
>  tools/perf/util/include/linux/compiler.h           |    1 +
>  tools/perf/util/map.c                              |   16 +-
>  tools/perf/util/map.h                              |    7 +-
>  tools/perf/util/perf_regs.h                        |   10 +
>  tools/perf/util/python.c                           |    3 +-
>  .../perf/util/scripting-engines/trace-event-perl.c |    3 +-
>  .../util/scripting-engines/trace-event-python.c    |    3 +-
>  tools/perf/util/session.c                          |  100 +++-
>  tools/perf/util/session.h                          |   10 +-
>  tools/perf/util/symbol.c                           |  317 +++++++++---
>  tools/perf/util/symbol.h                           |   40 +-
>  tools/perf/util/trace-event-scripting.c            |    3 +-
>  tools/perf/util/trace-event.h                      |    5 +-
>  tools/perf/util/unwind.c                           |  563 ++++++++++++++++++++
>  tools/perf/util/unwind.h                           |   34 ++
>  tools/perf/util/vdso.c                             |   92 ++++
>  tools/perf/util/vdso.h                             |    7 +
>  50 files changed, 2023 insertions(+), 199 deletions(-)