[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <920839AF-AC7A-4CD3-975F-111C3C6F75B9@fb.com>
Date: Tue, 17 Mar 2020 19:54:43 +0000
From: Song Liu <songliubraving@...com>
To: Daniel Borkmann <daniel@...earbox.net>
CC: "linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
Networking <netdev@...r.kernel.org>,
"bpf@...r.kernel.org" <bpf@...r.kernel.org>,
Kernel Team <Kernel-team@...com>,
"ast@...nel.org" <ast@...nel.org>,
"mcgrof@...nel.org" <mcgrof@...nel.org>,
"keescook@...omium.org" <keescook@...omium.org>,
"yzaikin@...gle.com" <yzaikin@...gle.com>
Subject: Re: [PATCH v2 bpf-next] bpf: sharing bpf runtime stats with
/dev/bpf_stats
> On Mar 17, 2020, at 12:30 PM, Daniel Borkmann <daniel@...earbox.net> wrote:
>
> On 3/16/20 9:33 PM, Song Liu wrote:
>> Currently, sysctl kernel.bpf_stats_enabled controls BPF runtime stats.
>> Typical userspace tools use kernel.bpf_stats_enabled as follows:
>> 1. Enable kernel.bpf_stats_enabled;
>> 2. Check program run_time_ns;
>> 3. Sleep for the monitoring period;
>> 4. Check program run_time_ns again, calculate the difference;
>> 5. Disable kernel.bpf_stats_enabled.
>> The problem with this approach is that only one userspace tool can toggle
>> this sysctl. If multiple tools toggle the sysctl at the same time, the
>> measurement may be inaccurate.
>> To fix this problem while keep backward compatibility, introduce a new
>> bpf command BPF_ENABLE_RUNTIME_STATS. On success, this command enables
>> run_time_ns stats and returns a valid fd.
>> With BPF_ENABLE_RUNTIME_STATS, user space tool would have the following
>> flow:
>> 1. Get a fd with BPF_ENABLE_RUNTIME_STATS, and make sure it is valid;
>> 2. Check program run_time_ns;
>> 3. Sleep for the monitoring period;
>> 4. Check program run_time_ns again, calculate the difference;
>> 5. Close the fd.
>> Signed-off-by: Song Liu <songliubraving@...com>
>
> Hmm, I see no relation to /dev/bpf_stats anymore, yet the subject still talks
> about it?
My fault. Will fix..
>
> Also, should this have bpftool integration now that we have `bpftool prog profile`
> support? Would be nice to then fetch the related stats via bpf_prog_info, so users
> can consume this in an easy way.
We can add "run_time_ns" as a metric to "bpftool prog profile". But the
mechanism is not the same though. Let me think about this.
>
>> Changes RFC => v2:
>> 1. Add a new bpf command instead of /dev/bpf_stats;
>> 2. Remove the jump_label patch, which is no longer needed;
>> 3. Add a static variable to save previous value of the sysctl.
>> ---
>> include/linux/bpf.h | 1 +
>> include/uapi/linux/bpf.h | 1 +
>> kernel/bpf/syscall.c | 43 ++++++++++++++++++++++++++++++++++
>> kernel/sysctl.c | 36 +++++++++++++++++++++++++++-
>> tools/include/uapi/linux/bpf.h | 1 +
>> 5 files changed, 81 insertions(+), 1 deletion(-)
>> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
>> index 4fd91b7c95ea..d542349771df 100644
>> --- a/include/linux/bpf.h
>> +++ b/include/linux/bpf.h
>> @@ -970,6 +970,7 @@ _out: \
>> #ifdef CONFIG_BPF_SYSCALL
>> DECLARE_PER_CPU(int, bpf_prog_active);
>> +extern struct mutex bpf_stats_enabled_mutex;
>> /*
>> * Block execution of BPF programs attached to instrumentation (perf,
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index 40b2d9476268..8285ff37210c 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -111,6 +111,7 @@ enum bpf_cmd {
>> BPF_MAP_LOOKUP_AND_DELETE_BATCH,
>> BPF_MAP_UPDATE_BATCH,
>> BPF_MAP_DELETE_BATCH,
>> + BPF_ENABLE_RUNTIME_STATS,
>> };
>> enum bpf_map_type {
>> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
>> index b2f73ecacced..823dc9de7953 100644
>> --- a/kernel/bpf/syscall.c
>> +++ b/kernel/bpf/syscall.c
>> @@ -24,6 +24,9 @@
>> #include <linux/ctype.h>
>> #include <linux/nospec.h>
>> #include <linux/audit.h>
>> +#include <linux/miscdevice.h>
>
> Is this still needed?
My fault again. Will fix.
Thanks,
Song
Powered by blists - more mailing lists