lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Fri, 05 Jun 2015 07:38:59 +0900
From:	Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>
To:	Tom Zanussi <tom.zanussi@...ux.intel.com>, rostedt@...dmis.org
CC:	daniel.wagner@...-carit.de, namhyung@...nel.org,
	josh@...htriplett.org, andi@...stfloor.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v6 00/10] tracing: 'hist' triggers

On 2015/06/04 3:30, Tom Zanussi wrote:
> This is v6 of the 'hist triggers' patchset, following feedback from
> v5.
> 
> Changes from v5:
> 
> This version adds support for compound keys, along with the related
> ability to sort using primary and secondary keys.  This was mentioned
> in previous versions as the last important piece that remained
> unimplemented, and is now implemented.  (I didn't have time to get to
> the couple of enhancements suggested by Masami, but I expect to be
> able to add those later on top of these.)

No problem, I agree that we can add such enhancement afterwards.

> Because we now support compound keys and it's not immediately clear in
> the output exactly which fields correspond to keys, the key(s),
> compound or not, are now enclosed by curly braces.

Yes, this looks very nice :)
Here, I've tested with probes.

----
[root@...alhost tracing]# perf probe __kmalloc caller=\$stack0 size
Added new event:
  probe:__kmalloc      (on __kmalloc with caller=$stack0 size)

You can now use it in all perf tools, such as:

        perf record -e probe:__kmalloc -aR sleep 1

[root@...alhost tracing]# echo 'hist:keys=caller.sym,size:vals=hitcount:sort=size' > events/probe/__kmalloc/trigger
[root@...alhost tracing]# cat events/probe/__kmalloc/hist
# trigger info: hist:keys=caller.sym,size:vals=hitcount:sort=size:size=2048 [active]

{ caller: [ffffffffa020fdb7] drm_mode_dirtyfb_ioctl             , size:          8 } hitcount:         12
{ caller: [ffffffff812966c0] load_elf_binary                    , size:         28 } hitcount:          1
{ caller: [ffffffff8125bde4] alloc_fdmem                        , size:         64 } hitcount:          1
{ caller: [ffffffffa02e0770] __nf_ct_ext_add_length             , size:         96 } hitcount:          1
{ caller: [ffffffff81415eb8] bio_alloc_bioset                   , size:        152 } hitcount:          1
{ caller: [ffffffff81296456] load_elf_phdrs                     , size:        392 } hitcount:          1
{ caller: [ffffffff81296456] load_elf_phdrs                     , size:        504 } hitcount:          1
{ caller: [ffffffff81528f18] __tty_buffer_request_room          , size:       1056 } hitcount:          1
{ caller: [ffffffff8169ca3b] sk_prot_alloc                      , size:       1136 } hitcount:          1
{ caller: [ffffffff8125bde4] alloc_fdmem                        , size:       2048 } hitcount:          1
{ caller: [ffffffff8126275b] seq_buf_alloc                      , size:       4096 } hitcount:          6
{ caller: [ffffffff81196540] tracing_map_sort_entries           , size:      16384 } hitcount:          1

Totals:
    Hits: 28
    Entries: 12
    Dropped: 0
----
BTW, when I pass 'keys' as a sort key, it just ignored and no error is returned.
I think it would better return an error if parsing is failed.

Thank you!



> 
> The Documentation and README have been updated to reflect the changes,
> and several new examples have been added to illustrate how to use
> compound keys.
> 
> Also, the code was updated to work with the new ftrace_event_file,
> etc, renaming in tracing/for-next.
> 
> Changes from v4:
> 
> This version addresses some problems and suggestions made by Daniel
> Wagner - a lot of the code was reworked to get rid of the distinction
> between keys and values, and as a result, both keys and values can be
> used as sort keys.  As suggested, it also allows 'val=' to be absent
> in a trigger command - if no 'val' is specified, hitcount is assumed
> and automatically used as the only val.
> 
> The map code was also separated out into a separate file,
> tracing_map.c, allowing it to be reused.  It also adds a second tracer
> called function_hist that actually does reuse the code, as an RFC
> patch.
> 
> Patch 01/10 [tracing: Update cond flag when enabling or disabling..]
> is a fix for a problem noticed by Daniel and that fixes a problem in
> existing trigger code and should be applied regardless of whether the
> rest of the patchset is merged.
> 
> As mentioned, patch 10/10 is an RFC patch implementing a new tracer
> based on the function tracer code.  It's a fun little tool and is
> useful for a specific problem I'm working on (and is also a nice test
> of the tracing_map code), but is an RFC because first, I'm not sure it
> would really be of general interest and secondly, it's POC-level
> quality and I'd need to spend more time fixing it up to make it
> upstreamable, but I don't want to waste my time if not.
> 
> There are a couple of important bits of functionality that were
> present in v1 but not yet reimplemented in v5.
> 
> The first is support for compound keys.  Currently, maps can only be
> keyed on a single event field, whereas in v1 they could be keyed on
> multiple keys.  With support for compound keys, you can create much
> more interesting output, such as for example per-pid lists of
> syscalls or read counts e.g.:
> 
>   # echo 'hist:keys=common_pid.execname,id.syscall:vals=hitcount' > \
>         /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger
> 
>   # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist
> 
>   key: common_pid:bash[3112], id:sys_write		       vals: count:69
>   key: common_pid:bash[3112], id:sys_rt_sigprocmask	       vals: count:218
> 
>   key: common_pid:update-notifier[3164], id:sys_poll	       vals: count:37
>   key: common_pid:update-notifier[3164], id:sys_recvfrom       vals: count:118
> 
>   key: common_pid:deja-dup-monito[3194], id:sys_sendto	       vals: count:1
>   key: common_pid:deja-dup-monito[3194], id:sys_read	       vals: count:4
>   key: common_pid:deja-dup-monito[3194], id:sys_poll	       vals: count:8
>   key: common_pid:deja-dup-monito[3194], id:sys_recvmsg	       vals: count:8
>   key: common_pid:deja-dup-monito[3194], id:sys_getegid	       vals: count:8
> 
>   key: common_pid:emacs[3275], id:sys_fsync		       vals: count:1
>   key: common_pid:emacs[3275], id:sys_open		       vals: count:1
>   key: common_pid:emacs[3275], id:sys_symlink		       vals: count:2
>   key: common_pid:emacs[3275], id:sys_poll		       vals: count:23
>   key: common_pid:emacs[3275], id:sys_select		       vals: count:23
>   key: common_pid:emacs[3275], id:unknown_syscall	       vals: count:34
>   key: common_pid:emacs[3275], id:sys_ioctl		       vals: count:60
>   key: common_pid:emacs[3275], id:sys_rt_sigprocmask	       vals: count:116
> 
>   key: common_pid:cat[3323], id:sys_munmap		       vals: count:1
>   key: common_pid:cat[3323], id:sys_fadvise64		       vals: count:1
> 
> Related to that is support for sorting on multiple fields.  Currently,
> you can sort using only a primary key.  Being able to sort on multiple
> or at least a secondary key is indispensible for seeing trends when
> displaying multiple values.
> 
> Changes from v3:
> 
> v4 fixes the race in tracing_map_insert() noted in v3, where
> map.val.key could be checked even if map.val wasn't yet set.  The
> simple fix for that in tracing_map_insert() introduces the possibility
> of duplicates in the map, which though rare, need to be accounted for
> in the output.  To address that, duplicate-merging code was added to
> the map-printing code.
> 
> It was also pointed out that it didn't seem correct to include
> module.h, but the fix for that has deeper roots and is being addressed
> by a separate patchset; for now we need to continue including
> module.h, though prompted by that I did some other header include
> cleanup.
> 
> The functionality remains the same as v2, but this version no longer
> tries to export and use bpf_maps, and more importantly removes the
> associated GFP_NOTRACE/trace event hacks and kmem macros required to
> work around the bpf_map implementation.
> 
> The tracing_map functionality is instead built on top of a simple
> lock-free map algorithm originated by Dr. Cliff Click (see references
> in the code for more details), which though too restrictive to be
> general-purpose in its current form, functions nicely as a
> special-purpose tracing map.
> 
> v3 also moves the hist triggers code into a separate file and puts it
> all behind a new config option, CONFIG_HIST_TRIGGERS.  It also merges
> in the sorting code rather than keeping it as a separate patch.
> 
> This patchset also includes a couple other new and related triggers,
> enable_hist and disable_hist, very similar to the existing
> enable_event/disable_event triggers used to automatically enable and
> disable events based on a triggering condition, but in this case
> allowing hist triggers to be enabled and disabled in the same way.
> 
>  - Added an insert check for val before checking the key associated with val
>  - Added code to merge possible duplicates in the map
> 
> Changes from v2:
>  - reimplemented tracing_map, replacing bpf_map with nmi-safe/lock-free map
>  - removed GPF_NOTRACE, kmalloc/free macros and event hacks needed by bpf_maps
>  - moved hist triggers from trace_events_trigger.c to trace_events_hist.c
>  - added CONFIG_HIST_TRIGGERS config option
>  - consolidated sorting code with main patch
> 
> Changes from v1:
>  - completely rewritten on top of tracing_map (renamed and exported bpf_map)
>  - added map clearing and client ops to tracing_map
>  - changed the name from 'hash' triggers to 'hist' triggers
>  - added new trigger 'pause' feature
>  - added new enable_hist and disable_hist triggers
>  - added usage for hist/enable_hist/disable hist to tracing/README
>  - moved examples into Documentation/trace/event.txt
>  - added ___GFP_NOTRACE, kmalloc/kfree macros, and conditional kmem tracepoints
> 
> The following changes since commit a497adb45b8691f7e477e711a1a4bd54748d64fe:
> 
>   ring-buffer: Add enum names for the context levels (2015-05-29 10:39:08 -0400)
> 
> are available in the git repository at:
> 
>   git://git.yoctoproject.org/linux-yocto-contrib.git tzanussi/hist-triggers-v6
>   http://git.yoctoproject.org/cgit/cgit.cgi/linux-yocto-contrib/log/?h=tzanussi/hist-triggers-v6
> 
> Tom Zanussi (10):
>   tracing: Update cond flag when enabling or disabling a trigger
>   tracing: Make ftrace_event_field checking functions available
>   tracing: Add event record param to trigger_ops.func()
>   tracing: Add get_syscall_name()
>   tracing: Add a per-event-trigger 'paused' field
>   trace: Add lock-free tracing_map
>   tracing: Add 'hist' event trigger command
>   tracing: Add enable_hist/disable_hist triggers
>   tracing: Add 'hist' trigger Documentation
>   ftrace: Add function_hist tracer
> 
>  Documentation/trace/events.txt      | 1091 ++++++++++++++++++++++++++++
>  include/linux/trace_events.h        |    9 +-
>  kernel/trace/Kconfig                |   14 +
>  kernel/trace/Makefile               |    3 +
>  kernel/trace/trace.c                |   68 ++
>  kernel/trace/trace.h                |   88 ++-
>  kernel/trace/trace_events.c         |    4 +
>  kernel/trace/trace_events_filter.c  |   12 -
>  kernel/trace/trace_events_hist.c    | 1360 +++++++++++++++++++++++++++++++++++
>  kernel/trace/trace_events_trigger.c |  149 ++--
>  kernel/trace/trace_functions_hist.c |  345 +++++++++
>  kernel/trace/trace_syscalls.c       |   11 +
>  kernel/trace/tracing_map.c          |  920 ++++++++++++++++++++++++
>  kernel/trace/tracing_map.h          |  143 ++++
>  14 files changed, 4133 insertions(+), 84 deletions(-)
>  create mode 100644 kernel/trace/trace_events_hist.c
>  create mode 100644 kernel/trace/trace_functions_hist.c
>  create mode 100644 kernel/trace/tracing_map.c
>  create mode 100644 kernel/trace/tracing_map.h
> 


-- 
Masami HIRAMATSU
Linux Technology Research Center, System Productivity Research Dept.
Center for Technology Innovation - Systems Engineering
Hitachi, Ltd., Research & Development Group
E-mail: masami.hiramatsu.pt@...achi.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ