lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Thu, 30 Apr 2020 01:33:45 +0200 From: Daniel Borkmann <daniel@...earbox.net> To: Song Liu <songliubraving@...com>, bpf@...r.kernel.org, netdev@...r.kernel.org Cc: ast@...nel.org, kernel-team@...com Subject: Re: [PATCH v8 bpf-next 1/3] bpf: sharing bpf runtime stats with BPF_ENABLE_STATS On 4/29/20 8:45 AM, Song Liu wrote: > Currently, sysctl kernel.bpf_stats_enabled controls BPF runtime stats. > Typical userspace tools use kernel.bpf_stats_enabled as follows: > > 1. Enable kernel.bpf_stats_enabled; > 2. Check program run_time_ns; > 3. Sleep for the monitoring period; > 4. Check program run_time_ns again, calculate the difference; > 5. Disable kernel.bpf_stats_enabled. > > The problem with this approach is that only one userspace tool can toggle > this sysctl. If multiple tools toggle the sysctl at the same time, the > measurement may be inaccurate. > > To fix this problem while keep backward compatibility, introduce a new > bpf command BPF_ENABLE_STATS. On success, this command enables stats and > returns a valid fd. BPF_ENABLE_STATS takes argument "type". Currently, > only one type, BPF_STATS_RUN_TIME, is supported. We can extend the > command to support other types of stats in the future. > > With BPF_ENABLE_STATS, user space tool would have the following flow: > > 1. Get a fd with BPF_ENABLE_STATS, and make sure it is valid; > 2. Check program run_time_ns; > 3. Sleep for the monitoring period; > 4. Check program run_time_ns again, calculate the difference; > 5. Close the fd. > > Signed-off-by: Song Liu <songliubraving@...com> [...] > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c > index d23c04cbe14f..8691b2cc550d 100644 > --- a/kernel/bpf/syscall.c > +++ b/kernel/bpf/syscall.c > @@ -3872,6 +3872,60 @@ static int bpf_link_get_fd_by_id(const union bpf_attr *attr) > return fd; > } > > +DEFINE_MUTEX(bpf_stats_enabled_mutex); > + > +static int bpf_stats_release(struct inode *inode, struct file *file) > +{ > + mutex_lock(&bpf_stats_enabled_mutex); > + static_key_slow_dec(&bpf_stats_enabled_key.key); > + mutex_unlock(&bpf_stats_enabled_mutex); > + return 0; > +} > + > +static const struct file_operations bpf_stats_fops = { > + .release = bpf_stats_release, > +}; > + > +static int bpf_enable_runtime_stats(void) > +{ > + int fd; > + > + mutex_lock(&bpf_stats_enabled_mutex); > + > + /* Set a very high limit to avoid overflow */ > + if (static_key_count(&bpf_stats_enabled_key.key) > INT_MAX / 2) { > + mutex_unlock(&bpf_stats_enabled_mutex); > + return -EBUSY; > + } > + > + fd = anon_inode_getfd("bpf-stats", &bpf_stats_fops, NULL, 0); Missing O_CLOEXEC or intentional (if latter, I'd have expected a comment here though)? > + if (fd >= 0) > + static_key_slow_inc(&bpf_stats_enabled_key.key); > + > + mutex_unlock(&bpf_stats_enabled_mutex); > + return fd; > +} > + > +#define BPF_ENABLE_STATS_LAST_FIELD enable_stats.type > + [...] > diff --git a/kernel/sysctl.c b/kernel/sysctl.c > index e961286d0e14..af08ef0690cb 100644 > --- a/kernel/sysctl.c > +++ b/kernel/sysctl.c > @@ -201,6 +201,40 @@ static int max_extfrag_threshold = 1000; > > #endif /* CONFIG_SYSCTL */ > > +#ifdef CONFIG_BPF_SYSCALL > +static int bpf_stats_handler(struct ctl_table *table, int write, > + void __user *buffer, size_t *lenp, > + loff_t *ppos) > +{ > + struct static_key *key = (struct static_key *)table->data; > + static int saved_val; > + int val, ret; > + struct ctl_table tmp = { > + .data = &val, > + .maxlen = sizeof(val), > + .mode = table->mode, > + .extra1 = SYSCTL_ZERO, > + .extra2 = SYSCTL_ONE, > + }; > + > + if (write && !capable(CAP_SYS_ADMIN)) > + return -EPERM; > + > + mutex_lock(&bpf_stats_enabled_mutex); > + val = saved_val; > + ret = proc_dointvec_minmax(&tmp, write, buffer, lenp, ppos); > + if (write && !ret && val != saved_val) { > + if (val) > + static_key_slow_inc(key); > + else > + static_key_slow_dec(key); > + saved_val = val; > + } > + mutex_unlock(&bpf_stats_enabled_mutex); > + return ret; > +} nit: I wonder if most of the logic could have been shared with proc_do_static_key() here and only the mutex passed as an arg to the common helper? > +#endif > + > /* > * /proc/sys support > */ > @@ -2549,7 +2583,7 @@ static struct ctl_table kern_table[] = { > .data = &bpf_stats_enabled_key.key, > .maxlen = sizeof(bpf_stats_enabled_key), > .mode = 0644,
Powered by blists - more mailing lists