Date:   Thu, 17 Jun 2021 13:29:45 -0700
From:   Andrii Nakryiko <andrii.nakryiko@...il.com>
To:     Jiri Olsa <jolsa@...nel.org>
Cc:     Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andriin@...com>,
        "Steven Rostedt (VMware)" <rostedt@...dmis.org>,
        Networking <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>,
        Martin KaFai Lau <kafai@...com>,
        Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
        John Fastabend <john.fastabend@...il.com>,
        KP Singh <kpsingh@...omium.org>, Daniel Xu <dxu@...uu.xyz>,
        Viktor Malik <vmalik@...hat.com>
Subject: Re: [RFCv3 00/19] x86/ftrace/bpf: Add batch support for
 direct/tracing attach

On Sat, Jun 5, 2021 at 4:12 AM Jiri Olsa <jolsa@...nel.org> wrote:
>
> hi,
> saga continues.. ;-) previous post is in here [1]
>
> After another discussion with Steven, he mentioned that if we fix
> the ftrace graph problem with direct functions, he'd be open to
> adding a batch interface for direct ftrace functions.
>
> He already had a proof-of-concept fix for that, which I took and broke
> up into several changes. I added the ftrace direct batch interface
> and the new bpf interface on top of that.
>
> It's not so many patches after all, so I thought having them all
> together will help the review, because they are all connected.
> However I can break this up into separate patchsets if necessary.
>
> This patchset contains:
>
>   1) patches (1-4) that fix the ftrace graph tracing over the function
>      with direct trampolines attached
>   2) patches (5-8) that add batch interface for ftrace direct function
>      register/unregister/modify
>   3) patches (9-19) that add support to attach BPF program to multiple
>      functions
>
> In a nutshell:
>
> Ad 1) moves the graph tracing setup before the direct trampoline
> prepares the stack, so they don't clash
>
> Ad 2) uses ftrace_ops interface to register direct function with
> all functions in ftrace_ops filter.
>
> Ad 3) creates a special program and trampoline type to allow attachment
> of multiple functions to a single program.
>
> There are more detailed descriptions in the related changelogs.
>
> I have working bpftrace multi attachment code on top of this. I briefly
> checked retsnoop and I think it could use the new API as well.

Ok, so I had a bit of time and enthusiasm to try that with retsnoop.
The ugly code is at [0] if you'd like to see what kind of changes I
needed to make to use this (it won't work if you check it out because
it needs your libbpf changes synced into submodule, which I only did
locally). But here are some takeaways from that experiment, both to
emphasize how important it is to make this work and to show how
restrictive some of the current limitations are.

First, good news. Using this mass-attach API to attach to almost 1000
kernel functions goes from

Plain fentry/fexit:
===================
real    0m27.321s
user    0m0.352s
sys     0m20.919s

to

Mass-attach fentry/fexit:
=========================
real    0m2.728s
user    0m0.329s
sys     0m2.380s

That's a 10x speed-up in wall-clock time. And a good chunk of those
2.7 seconds is spent in preparatory steps not related to fentry/fexit
attachment.
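
For reference, the ratios behind that figure can be recomputed from
the two `time` outputs above. This is only a sketch of the arithmetic;
the function name is made up for illustration, and the inputs are the
numbers quoted in this email.

```python
def speedup(before_s, after_s):
    """Return how many times faster the 'after' run is than the 'before' run."""
    return before_s / after_s


# Timings copied from the two runs above (seconds).
plain = {"real": 27.321, "user": 0.352, "sys": 20.919}
mass = {"real": 2.728, "user": 0.329, "sys": 2.380}

for key in ("real", "user", "sys"):
    print(f"{key}: {speedup(plain[key], mass[key]):.2f}x")
```

Almost all of the gain is in sys time, which fits the attach path in
the kernel being the bottleneck rather than anything in user space.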

It's not exactly apples-to-apples, though, because the limitations you
have right now prevent attaching both fentry and fexit programs to
the same set of kernel functions. This makes it pretty useless for a
lot of cases, in particular for retsnoop. So I haven't really tested
retsnoop end-to-end, I only verified that I do see fentries triggered,
but can't have matching fexits. So the speed-up might be smaller due
to additional fexit mass-attach (once that is allowed), but it's still
a massive difference. So we absolutely need to get this optimization
in.

A few more thoughts, if you'd like to plan some more work ahead ;)

1. We need similar mass-attach functionality for kprobe/kretprobe, as
there are use cases where kprobes are more useful than fentry (e.g.,
funcs with >6 args, or funcs with input arguments that are not
supported by the BPF verifier, like struct-by-value). It's not clear
how best to represent this, given that we currently attach kprobes
through perf_event, but we'll need to think about this for sure.

2. To make mass-attach fentry/fexit useful for practical purposes, it
would be really great to be able to fetch the traced function's IP.
I.e., if we fentry/fexit func kern_func_abc, bpf_get_func_ip() would
return the IP of that function, matching the one in /proc/kallsyms.
Right now I do very brittle hacks to do that.
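
To illustrate why the IP has to match /proc/kallsyms, here is a rough
user-space sketch of the lookup a tool would do with the value
returned by the proposed bpf_get_func_ip(). It is not retsnoop's
actual code: the function names and the sample addresses are invented
for illustration, and only the "address type name [module]" line
layout of /proc/kallsyms is assumed.

```python
def parse_kallsyms(text):
    """Build an address -> symbol name table from /proc/kallsyms-style text."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip malformed or empty lines
        addr = int(fields[0], 16)
        if addr == 0:
            continue  # unprivileged readers see zeroed addresses
        # Several symbols can share one address; keep the first one seen.
        table.setdefault(addr, fields[2])
    return table


def resolve_ip(table, ip):
    """Return the symbol starting exactly at ip, or None if there is none."""
    return table.get(ip)


# Made-up sample data in /proc/kallsyms format (addresses are not real).
SAMPLE = """\
ffffffff81000000 T _text
ffffffff812a4b10 T kern_func_abc
ffffffff812a4c40 t helper_static
ffffffffc0a01000 t mod_func\t[some_module]
"""

table = parse_kallsyms(SAMPLE)
print(resolve_ip(table, 0xFFFFFFFF812A4B10))
```

The lookup is an exact match on the function's start address, so it
only works if the helper reports the function entry IP rather than
some address inside the trampoline or past the patched call site.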

So all-in-all, super excited about this, but I hope all those issues
are addressed to make retsnoop possible and fast.

  [0] https://github.com/anakryiko/retsnoop/commit/8a07bc4d8c47d025f755c108f92f0583e3fda6d8

>
>
> Also available at:
>   https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
>   bpf/batch
>
> thanks,
> jirka
>
>
> [1] https://lore.kernel.org/bpf/20210413121516.1467989-1-jolsa@kernel.org/
>
> ---
> Jiri Olsa (17):
>       x86/ftrace: Remove extra orig rax move
>       tracing: Add trampoline/graph selftest
>       ftrace: Add ftrace_add_rec_direct function
>       ftrace: Add multi direct register/unregister interface
>       ftrace: Add multi direct modify interface
>       ftrace/samples: Add multi direct interface test module
>       bpf, x64: Allow to use caller address from stack
>       bpf: Allow to store caller's ip as argument
>       bpf: Add support to load multi func tracing program
>       bpf: Add bpf_trampoline_alloc function
>       bpf: Add support to link multi func tracing program
>       libbpf: Add btf__find_by_pattern_kind function
>       libbpf: Add support to link multi func tracing program
>       selftests/bpf: Add fentry multi func test
>       selftests/bpf: Add fexit multi func test
>       selftests/bpf: Add fentry/fexit multi func test
>       selftests/bpf: Temporary fix for fentry_fexit_multi_test
>
> Steven Rostedt (VMware) (2):
>       x86/ftrace: Remove fault protection code in prepare_ftrace_return
>       x86/ftrace: Make function graph use ftrace directly
>

[...]
