linux-kernel - Re: [PATCH v7 00/11] tracing: Add fprobe events

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20230428094006.49275a20ab80e0142a6c43c0@kernel.org>
Date:   Fri, 28 Apr 2023 09:40:06 +0900
From:   Masami Hiramatsu (Google) <mhiramat@...nel.org>
To:     Jiri Olsa <olsajiri@...il.com>
Cc:     linux-trace-kernel@...r.kernel.org, linux-kernel@...r.kernel.org,
        Steven Rostedt <rostedt@...dmis.org>,
        Florent Revest <revest@...omium.org>,
        Mark Rutland <mark.rutland@....com>,
        Will Deacon <will@...nel.org>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Martin KaFai Lau <martin.lau@...ux.dev>, bpf@...r.kernel.org
Subject: Re: [PATCH v7 00/11] tracing: Add fprobe events

On Thu, 27 Apr 2023 13:58:56 +0200
Jiri Olsa <olsajiri@...il.com> wrote:

> On Thu, Apr 27, 2023 at 10:17:45AM +0900, Masami Hiramatsu (Google) wrote:
> > Hi,
> > 
> > Here is the 7th version of improve fprobe and add a basic fprobe event
> > support for ftrace (tracefs) and perf. Here is the previous version.
> > 
> > https://lore.kernel.org/all/168234755610.2210510.12133559313738141202.stgit@mhiramat.roam.corp.google.com/
> > 
> > This version is rebased on the latest linux-trace/for-next, fixes
> > bpf_get_btf_vmlinux() return value check [6/11] and adds new BTF $retval
> > type support [9/11] (I forgot to implement this feature last time).
> > Also updates according to the BTF $retval type support.
> > 
> > You can also get this series from:
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git topic/fprobe-event-ext
> > 
> > With this fprobe events, we can continue to trace function entry/exit
> > even if the CONFIG_KPROBES_ON_FTRACE is not available. Since
> > CONFIG_KPROBES_ON_FTRACE requires the CONFIG_DYNAMIC_FTRACE_WITH_REGS,
> > it is not available if the architecture only supports
> > CONFIG_DYNAMIC_FTRACE_WITH_ARGS (e.g. arm64). And that means kprobe
> > events can not probe function entry/exit effectively on such architecture.
> > But this problem can be solved if the dynamic events supports fprobe events
> > because fprobe events doesn't use kprobe but ftrace via fprobe.
> > 
> > FPROBE EVENTS
> > =============
> > 
> > Fprobe events allows user to add new events on the entry and exit of kernel
> > functions (which can be ftraced). Unlike kprobe events, the fprobe events
> > can only probe the function entry and exit, and it can only trace the
> > function args, return value, and stacks. (no registers)
> > For probing function body, users can continue to use the kprobe events.
> > 
> > The tracepoint probe events (tprobe events) also allows user to add new
> > events dynamically on the tracepoint. Most of the tracepoint already has
> > trace-events, so this feature is useful if you only want to know a
> > specific parameter, or trace the tracepoints which has no trace-events
> > (e.g. sched_*_tp tracepoints only exposes the tracepoints.)
> > 
> > The fprobe events syntax is;
> > 
> >  f[:[GRP/][EVENT]] FUNCTION [FETCHARGS]
> >  f[MAXACTIVE][:[GRP/][EVENT]] FUNCTION%return [FETCHARGS]
> > 
> > And tracepoint probe events syntax is;
> > 
> >  t[:[GRP/][EVENT]] TRACEPOINT [FETCHARGS]
> > 
> > This series includes BTF argument support for fprobe/tracepoint events,
> > and kprobe events. This allows us to fetch a specific function parameter
> > by name, and all parameters by '$$args'.
> 
> are you planning to fetch and display more complicated types in future?
> like strings or dereferencing struct field from argument pointer

Yes, that's on my next TODO list. The string thing is a bit problematic,
but I think in most case, "const char *" is used for the strings.

> 
> > Note that enabling this feature, you need to enable CONFIG_BPF_SYSCALL and
> > confirm that your arch supports CONFIG_HAVE_FUNCTION_ARG_ACCESS_API.
> > 
> > E.g.
> > 
> >  # echo 't kfree ptr' >> dynamic_events
> >  # echo 'f kfree object' >> dynamic_events
> >  # cat dynamic_events 
> > t:tracepoints/kfree kfree ptr=ptr
> > f:fprobes/kfree__entry kfree object=object
> >  # echo 1 > events/fprobes/enable
> >  # echo 1 > events/tracepoints/enable
> >  # echo > trace
> >  # head -n 20 trace | tail
> > #           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
> > #              | |         |   |||||     |         |
> >             tail-84      [000] .....  1324.561958: kfree__entry: (kfree+0x4/0x140) object=0xffff888006383c00
> >             tail-84      [000] ...1.  1324.561961: kfree: (__probestub_kfree+0x4/0x10) ptr=0xffff888006383c00
> >             tail-84      [000] .....  1324.561988: kfree__entry: (kfree+0x4/0x140) object=0x0
> >             tail-84      [000] ...1.  1324.561988: kfree: (__probestub_kfree+0x4/0x10) ptr=0x0
> >             tail-84      [000] .....  1324.561989: kfree__entry: (kfree+0x4/0x140) object=0xffff88800671e600
> >             tail-84      [000] ...1.  1324.561989: kfree: (__probestub_kfree+0x4/0x10) ptr=0xffff88800671e600
> >             tail-84      [000] .....  1324.562368: kfree__entry: (kfree+0x4/0x140) object=0xffff8880065e0580
> >             tail-84      [000] ...1.  1324.562369: kfree: (__probestub_kfree+0x4/0x10) ptr=0xffff8880065e0580
> 
> I checked with perf and record/stat/script seem to work fine with this
> 
>   # ./perf record -e 'fprobes:myprobe'
>   ^C[ perf record: Woken up 1 times to write data ]
>   [ perf record: Captured and wrote 0.162 MB perf.data (1 samples) ]
> 
>   # ./perf script
>     systemd-oomd   479 [001] 14550.722079: fprobes:myprobe: (ffffffff81505be0) filename=0x557b033662b0

Thanks for testing!

> 
> perf trace seems to be off with __probe_ip for some reason:
> 
>   # ./perf trace -e 'fprobes:myprobe'
>      0.000 systemd-oomd/479 fprobes:myprobe(__probe_ip: -2125440032, filename: 93986839069680)
>      1.189 systemd-oomd/479 fprobes:myprobe(__probe_ip: -2125440032, filename: 93986839070144)
> 
> but it's probably perf issue

Yeah, it seems that the perf trace handles __probe_ip as signed long.
Does that happen with kprobe events too?

Thank you,
> 
> thanks,
> jirka


-- 
Masami Hiramatsu (Google) <mhiramat@...nel.org>