netdev - Re: [PATCH v2 bpf-next] bpf: sharing bpf runtime stats with /dev/bpf

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <eb31bed3-3be4-501e-4340-bd558b31ead2@iogearbox.net>
Date:   Tue, 17 Mar 2020 20:30:31 +0100
From:   Daniel Borkmann <daniel@...earbox.net>
To:     Song Liu <songliubraving@...com>, linux-fsdevel@...r.kernel.org,
        netdev@...r.kernel.org, bpf@...r.kernel.org
Cc:     kernel-team@...com, ast@...nel.org, mcgrof@...nel.org,
        keescook@...omium.org, yzaikin@...gle.com
Subject: Re: [PATCH v2 bpf-next] bpf: sharing bpf runtime stats with
 /dev/bpf_stats

On 3/16/20 9:33 PM, Song Liu wrote:
> Currently, sysctl kernel.bpf_stats_enabled controls BPF runtime stats.
> Typical userspace tools use kernel.bpf_stats_enabled as follows:
> 
>    1. Enable kernel.bpf_stats_enabled;
>    2. Check program run_time_ns;
>    3. Sleep for the monitoring period;
>    4. Check program run_time_ns again, calculate the difference;
>    5. Disable kernel.bpf_stats_enabled.
> 
> The problem with this approach is that only one userspace tool can toggle
> this sysctl. If multiple tools toggle the sysctl at the same time, the
> measurement may be inaccurate.
> 
> To fix this problem while keep backward compatibility, introduce a new
> bpf command BPF_ENABLE_RUNTIME_STATS. On success, this command enables
> run_time_ns stats and returns a valid fd.
> 
> With BPF_ENABLE_RUNTIME_STATS, user space tool would have the following
> flow:
> 
>    1. Get a fd with BPF_ENABLE_RUNTIME_STATS, and make sure it is valid;
>    2. Check program run_time_ns;
>    3. Sleep for the monitoring period;
>    4. Check program run_time_ns again, calculate the difference;
>    5. Close the fd.
> 
> Signed-off-by: Song Liu <songliubraving@...com>

Hmm, I see no relation to /dev/bpf_stats anymore, yet the subject still talks
about it?

Also, should this have bpftool integration now that we have `bpftool prog profile`
support? Would be nice to then fetch the related stats via bpf_prog_info, so users
can consume this in an easy way.

> Changes RFC => v2:
> 1. Add a new bpf command instead of /dev/bpf_stats;
> 2. Remove the jump_label patch, which is no longer needed;
> 3. Add a static variable to save previous value of the sysctl.
> ---
>   include/linux/bpf.h            |  1 +
>   include/uapi/linux/bpf.h       |  1 +
>   kernel/bpf/syscall.c           | 43 ++++++++++++++++++++++++++++++++++
>   kernel/sysctl.c                | 36 +++++++++++++++++++++++++++-
>   tools/include/uapi/linux/bpf.h |  1 +
>   5 files changed, 81 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 4fd91b7c95ea..d542349771df 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -970,6 +970,7 @@ _out:							\
>   
>   #ifdef CONFIG_BPF_SYSCALL
>   DECLARE_PER_CPU(int, bpf_prog_active);
> +extern struct mutex bpf_stats_enabled_mutex;
>   
>   /*
>    * Block execution of BPF programs attached to instrumentation (perf,
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 40b2d9476268..8285ff37210c 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -111,6 +111,7 @@ enum bpf_cmd {
>   	BPF_MAP_LOOKUP_AND_DELETE_BATCH,
>   	BPF_MAP_UPDATE_BATCH,
>   	BPF_MAP_DELETE_BATCH,
> +	BPF_ENABLE_RUNTIME_STATS,
>   };
>   
>   enum bpf_map_type {
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index b2f73ecacced..823dc9de7953 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -24,6 +24,9 @@
>   #include <linux/ctype.h>
>   #include <linux/nospec.h>
>   #include <linux/audit.h>
> +#include <linux/miscdevice.h>

Is this still needed?

> +#include <linux/fs.h>
> +#include <linux/jump_label.h>
>   #include <uapi/linux/btf.h>
>   
>   #define IS_FD_ARRAY(map) ((map)->map_type == BPF_MAP_TYPE_PERF_EVENT_ARRAY || \
> @@ -3550,6 +3553,43 @@ static int bpf_map_do_batch(const union bpf_attr *attr,
>   	return err;
>   }
>   

Thanks,
Daniel