netdev - Re: [RFCv3 00/19] x86/ftrace/bpf: Add batch support for direct/tracing attach

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e8f7ab9f-545a-2f43-82a6-91332a301a77@fb.com>
Date:   Sun, 20 Jun 2021 09:56:20 -0700
From:   Yonghong Song <yhs@...com>
To:     Jiri Olsa <jolsa@...hat.com>
CC:     Andrii Nakryiko <andrii.nakryiko@...il.com>,
        Jiri Olsa <jolsa@...nel.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andriin@...com>,
        "Steven Rostedt (VMware)" <rostedt@...dmis.org>,
        Networking <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>,
        Martin KaFai Lau <kafai@...com>,
        Song Liu <songliubraving@...com>,
        John Fastabend <john.fastabend@...il.com>,
        KP Singh <kpsingh@...omium.org>, Daniel Xu <dxu@...uu.xyz>,
        Viktor Malik <vmalik@...hat.com>
Subject: Re: [RFCv3 00/19] x86/ftrace/bpf: Add batch support for
 direct/tracing attach



On 6/19/21 10:09 AM, Jiri Olsa wrote:
> On Sat, Jun 19, 2021 at 09:19:57AM -0700, Yonghong Song wrote:
>>
>>
>> On 6/19/21 1:33 AM, Jiri Olsa wrote:
>>> On Thu, Jun 17, 2021 at 01:29:45PM -0700, Andrii Nakryiko wrote:
>>>> On Sat, Jun 5, 2021 at 4:12 AM Jiri Olsa <jolsa@...nel.org> wrote:
>>>>>
>>>>> hi,
>>>>> saga continues.. ;-) previous post is in here [1]
>>>>>
>>>>> After another discussion with Steven, he mentioned that if we fix
>>>>> the ftrace graph problem with direct functions, he'd be open to
>>>>> add batch interface for direct ftrace functions.
>>>>>
>>>>> He already had prove of concept fix for that, which I took and broke
>>>>> up into several changes. I added the ftrace direct batch interface
>>>>> and bpf new interface on top of that.
>>>>>
>>>>> It's not so many patches after all, so I thought having them all
>>>>> together will help the review, because they are all connected.
>>>>> However I can break this up into separate patchsets if necessary.
>>>>>
>>>>> This patchset contains:
>>>>>
>>>>>     1) patches (1-4) that fix the ftrace graph tracing over the function
>>>>>        with direct trampolines attached
>>>>>     2) patches (5-8) that add batch interface for ftrace direct function
>>>>>        register/unregister/modify
>>>>>     3) patches (9-19) that add support to attach BPF program to multiple
>>>>>        functions
>>>>>
>>>>> In nutshell:
>>>>>
>>>>> Ad 1) moves the graph tracing setup before the direct trampoline
>>>>> prepares the stack, so they don't clash
>>>>>
>>>>> Ad 2) uses ftrace_ops interface to register direct function with
>>>>> all functions in ftrace_ops filter.
>>>>>
>>>>> Ad 3) creates special program and trampoline type to allow attachment
>>>>> of multiple functions to single program.
>>>>>
>>>>> There're more detailed desriptions in related changelogs.
>>>>>
>>>>> I have working bpftrace multi attachment code on top this. I briefly
>>>>> checked retsnoop and I think it could use the new API as well.
>>>>
>>>> Ok, so I had a bit of time and enthusiasm to try that with retsnoop.
>>>> The ugly code is at [0] if you'd like to see what kind of changes I
>>>> needed to make to use this (it won't work if you check it out because
>>>> it needs your libbpf changes synced into submodule, which I only did
>>>> locally). But here are some learnings from that experiment both to
>>>> emphasize how important it is to make this work and how restrictive
>>>> are some of the current limitations.
>>>>
>>>> First, good news. Using this mass-attach API to attach to almost 1000
>>>> kernel functions goes from
>>>>
>>>> Plain fentry/fexit:
>>>> ===================
>>>> real    0m27.321s
>>>> user    0m0.352s
>>>> sys     0m20.919s
>>>>
>>>> to
>>>>
>>>> Mass-attach fentry/fexit:
>>>> =========================
>>>> real    0m2.728s
>>>> user    0m0.329s
>>>> sys     0m2.380s
>>>
>>> I did not meassured the bpftrace speedup, because the new code
>>> attached instantly ;-)
>>>
>>>>
>>>> It's a 10x speed up. And a good chunk of those 2.7 seconds is in some
>>>> preparatory steps not related to fentry/fexit stuff.
>>>>
>>>> It's not exactly apples-to-apples, though, because the limitations you
>>>> have right now prevents attaching both fentry and fexit programs to
>>>> the same set of kernel functions. This makes it pretty useless for a
>>>
>>> hum, you could do link_update with fexit program on the link fd,
>>> like in the selftest, right?
>>>
>>>> lot of cases, in particular for retsnoop. So I haven't really tested
>>>> retsnoop end-to-end, I only verified that I do see fentries triggered,
>>>> but can't have matching fexits. So the speed-up might be smaller due
>>>> to additional fexit mass-attach (once that is allowed), but it's still
>>>> a massive difference. So we absolutely need to get this optimization
>>>> in.
>>>>
>>>> Few more thoughts, if you'd like to plan some more work ahead ;)
>>>>
>>>> 1. We need similar mass-attach functionality for kprobe/kretprobe, as
>>>> there are use cases where kprobe are more useful than fentry (e.g., >6
>>>> args funcs, or funcs with input arguments that are not supported by
>>>> BPF verifier, like struct-by-value). It's not clear how to best
>>>> represent this, given currently we attach kprobe through perf_event,
>>>> but we'll need to think about this for sure.
>>>
>>> I'm fighting with the '2 trampolines concept' at the moment, but the
>>> mass attach for kprobes seems interesting ;-) will check
>>>
>>>>
>>>> 2. To make mass-attach fentry/fexit useful for practical purposes, it
>>>> would be really great to have an ability to fetch traced function's
>>>> IP. I.e., if we fentry/fexit func kern_func_abc, bpf_get_func_ip()
>>>> would return IP of that functions that matches the one in
>>>> /proc/kallsyms. Right now I do very brittle hacks to do that.
>>>
>>> so I hoped that we could store ip always in ctx-8 and have
>>> the bpf_get_func_ip helper to access that, but the BPF_PROG
>>> macro does not pass ctx value to the program, just args
>>
>> ctx does pass to the bpf program. You can check BPF_PROG
>> macro definition.
> 
> ah right, should have checked it.. so how about we change
> trampoline code to store ip in ctx-8 and make bpf_get_func_ip(ctx)
> to return [ctx-8]

This should work. Thanks!

> 
> I'll need to check if it's ok for the tracing helper to take
> ctx as argument
> 
> thanks,
> jirka
>