netdev - Re: [PATCH v2 bpf-next] bpf: sharing bpf runtime stats with /dev/bpf

Message-ID: <ba62e0be-6de6-036c-a836-178c1a9c079a@iogearbox.net>
Date:   Wed, 18 Mar 2020 21:58:07 +0100
From:   Daniel Borkmann <daniel@...earbox.net>
To:     Song Liu <songliubraving@...com>
Cc:     "linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
        Networking <netdev@...r.kernel.org>,
        "bpf@...r.kernel.org" <bpf@...r.kernel.org>,
        Kernel Team <Kernel-team@...com>,
        "ast@...nel.org" <ast@...nel.org>,
        "mcgrof@...nel.org" <mcgrof@...nel.org>,
        "keescook@...omium.org" <keescook@...omium.org>,
        "yzaikin@...gle.com" <yzaikin@...gle.com>
Subject: Re: [PATCH v2 bpf-next] bpf: sharing bpf runtime stats with
 /dev/bpf_stats

On 3/18/20 7:33 AM, Song Liu wrote:
>> On Mar 17, 2020, at 4:08 PM, Song Liu <songliubraving@...com> wrote:
>>> On Mar 17, 2020, at 2:47 PM, Daniel Borkmann <daniel@...earbox.net> wrote:
>>>>>
>>>>> Hm, true as well. Wouldn't long-term extending "bpftool prog profile" fentry/fexit
>>>>> programs supersede this old bpf_stats infrastructure? Iow, can't we implement the
>>>>> same (or even more elaborate stats aggregation) in BPF via fentry/fexit and then
>>>>> potentially deprecate bpf_stats counters?
>>>> I think run_time_ns has its own value as a simple monitoring framework. We can
>>>> use it in tools like top (and variations). It will be easier for these tools to
>>>> adopt run_time_ns than using fentry/fexit.
>>>
>>> Agree that this is easier; I presume there is no such official integration today
>>> in tools like top, right, or is there anything planned?
>>
>> Yes, we do want more support in different tools to increase the visibility.
>> Here is the effort for atop: https://github.com/Atoptool/atop/pull/88 .
>>
>> I wasn't pushing hard on this one, mostly because the sysctl interface requires
>> a user space "owner".
>>
>>>> On the other hand, in the long term, we may include a few fentry/fexit based
>>>> programs in the kernel binary (or the rpm), so that these tools can use them
>>>> easily. At that time, we can fully deprecate run_time_ns. Maybe this is not too
>>>> far away?
>>>
>>> Did you check how feasible it is to have something like `bpftool prog profile top`
>>> which then enables fentry/fexit for /all/ existing BPF programs in the system? It
>>> could then sort the sample interval by run_cnt, cycles, cache misses, aggregated
>>> runtime, etc in a top-like output. Wdyt?
>>
>> I wonder whether we can achieve this with one bpf prog (or a trampoline) that covers
>> all BPF programs, like a trampoline inside __BPF_PROG_RUN()?
>>
>> For long term direction, I think we could compare two different approaches: add new
>> tools (like bpftool prog profile top) vs. add BPF support to existing tools. The
>> first approach is easier. The latter approach would show BPF information to users
>> who are not expecting BPF programs in the systems. For many sysadmins, seeing BPF
>> programs in top/ps, and controlling them via kill is more natural than learning
>> bpftool. What's your thought on this?
> 
> More thoughts on this.
> 
> If we have a special trampoline that attaches to all BPF programs at once, we really
> don't need the run_time_ns stats anymore. Eventually, tools that monitor BPF
> programs will depend on libbpf, so using fentry/fexit to monitor BPF programs doesn't
> introduce an extra dependency. I guess we also need a way to include a BPF program
> in libbpf.
> 
> To summarize this plan, we need:
> 
> 1) A global trampoline that attaches to all BPF programs at once;

Overall sounds good. I think the `at once` part might be tricky: at the least, it
would need to patch one prog after another, and each prog also needs to store its own
metrics somewhere for later collection. The start-to-sample switch could be a shared
global var (aka a map shared between all the programs), though, which would flip
sampling on for all of them.
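
To make the idea concrete, here is a rough userspace model in Python, not BPF or
kernel code. Every name in it (Profiler, ProgStats, attach, sampling) is made up for
illustration and exists in neither the kernel nor libbpf. It only shows the shape:
progs get wrapped one after another, each keeps its own counters, and one shared flag
decides whether anything is measured.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ProgStats:
    """Per-program metrics, kept separately for later collection."""
    run_cnt: int = 0
    run_time_ns: int = 0


class Profiler:
    """Hypothetical model: a shared switch plus per-program counters."""

    def __init__(self) -> None:
        # Stands in for the shared global var / shared map.
        self.sampling = False
        self.stats: Dict[str, ProgStats] = {}

    def attach(self, name: str, prog: Callable) -> Callable:
        """Wrap one program; callers attach programs one after another."""
        stats = self.stats.setdefault(name, ProgStats())

        def wrapped(*args):
            if not self.sampling:
                # Switch is off: run the program without measuring.
                return prog(*args)
            start = time.monotonic_ns()
            try:
                return prog(*args)
            finally:
                stats.run_cnt += 1
                stats.run_time_ns += time.monotonic_ns() - start

        return wrapped
```

Attaching stays a per-prog loop, but because nothing is recorded until `sampling` is
set, all progs start being sampled from the same point, which is the part the shared
var buys us.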

> 2) Embed fentry/fexit program in libbpf, which will be used by tools for monitoring;
> 3) BPF helpers to read time, which replaces current run_time_ns.
> 
> Does this look reasonable?
> 
> Thanks,
> Song
>
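
For the top-like output discussed earlier in the thread (sorting a sample interval by
run_cnt, aggregated runtime, etc.), the userspace side is mostly a diff of two
snapshots. A minimal Python sketch, assuming each program exposes cumulative run_cnt
and run_time_ns counters; the Snapshot layout and the function name are hypothetical,
and how the snapshots get collected is left out:

```python
from typing import Dict, List, Tuple

# prog id -> (cumulative run_cnt, cumulative run_time_ns); layout is an assumption
Snapshot = Dict[int, Tuple[int, int]]


def rank_by_runtime(prev: Snapshot, curr: Snapshot) -> List[Tuple[int, int, int]]:
    """Return (prog_id, delta_run_cnt, delta_run_time_ns), busiest first."""
    rows = []
    for prog_id, (cnt, ns) in curr.items():
        # Programs loaded during the interval have no previous sample.
        prev_cnt, prev_ns = prev.get(prog_id, (0, 0))
        rows.append((prog_id, cnt - prev_cnt, ns - prev_ns))
    # Sort by runtime spent within the interval, descending.
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows
```

Sorting by run_cnt instead is just a different sort key; the same diffing applies
whether the counters come from bpf_stats or from fentry/fexit programs.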