linux-kernel - Re: [PATCH v11 net-next 04/12] bpf: expand BPF syscall with program load/unload

Message-ID: <5410062A.4090003@redhat.com>
Date:	Wed, 10 Sep 2014 10:04:58 +0200
From:	Daniel Borkmann <dborkman@...hat.com>
To:	Alexei Starovoitov <ast@...mgrid.com>
CC:	"David S. Miller" <davem@...emloft.net>,
	Ingo Molnar <mingo@...nel.org>,
	Linus Torvalds <torvalds@...uxfoundation.org>,
	Andy Lutomirski <luto@...capital.net>,
	Steven Rostedt <rostedt@...dmis.org>,
	Hannes Frederic Sowa <hannes@...essinduktion.org>,
	Chema Gonzalez <chema@...gle.com>,
	Eric Dumazet <edumazet@...gle.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Pablo Neira Ayuso <pablo@...filter.org>,
	"H. Peter Anvin" <hpa@...or.com>,
	Andrew Morton <akpm@...uxfoundation.org>,
	Kees Cook <keescook@...omium.org>, linux-api@...r.kernel.org,
	netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v11 net-next 04/12] bpf: expand BPF syscall with program
 load/unload

On 09/10/2014 07:10 AM, Alexei Starovoitov wrote:
> eBPF programs are similar to kernel modules. They are loaded by the user
> process and automatically unloaded when the process exits. Each eBPF program
> is a safe run-to-completion set of instructions. The eBPF verifier statically
> determines that the program terminates and is safe to execute.
>
> The following syscall wrapper can be used to load the program:
> int bpf_prog_load(enum bpf_prog_type prog_type,
>                    const struct bpf_insn *insns, int insn_cnt,
>                    const char *license)
> {
>      union bpf_attr attr = {
>          .prog_type = prog_type,
>          .insns = insns,
>          .insn_cnt = insn_cnt,
>          .license = license,
>      };
>
>      return bpf(BPF_PROG_LOAD, &attr, sizeof(attr));
> }
> where 'insns' is an array of eBPF instructions and 'license' is a string
> that must be GPL compatible to call helper functions marked gpl_only.
>
> Upon successful load the syscall returns prog_fd.
> Use close(prog_fd) to unload the program.
>
> User space tests and examples follow in the later patches.
>
> Signed-off-by: Alexei Starovoitov <ast@...mgrid.com>
...
> diff --git a/include/linux/filter.h b/include/linux/filter.h
> index 4b59edead908..9727616693e5 100644
> --- a/include/linux/filter.h
> +++ b/include/linux/filter.h
> @@ -15,6 +15,7 @@
>   struct sk_buff;
>   struct sock;
>   struct seccomp_data;
> +struct bpf_prog_info;
>
>   /* ArgX, context and stack frame pointer register positions. Note,
>    * Arg1, Arg2, Arg3, etc are used as argument mappings of function
> @@ -302,8 +303,12 @@ struct bpf_work_struct {
>   struct bpf_prog {
>   	u16			pages;		/* Number of allocated pages */
>   	bool			jited;		/* Is our filter JIT'ed? */
> +	bool			has_info;	/* whether 'info' is valid */
>   	u32			len;		/* Number of filter blocks */
> -	struct sock_fprog_kern	*orig_prog;	/* Original BPF program */
> +	union {
> +		struct sock_fprog_kern	*orig_prog;	/* Original BPF program */
> +		struct bpf_prog_info	*info;
> +	};

All members of this bpf_prog_info should go into bpf_work_struct,
as I intended that to be an ancillary structure here. Since we
already allocate it anyway, you can reduce complexity by avoiding
the additional allocation and removing the has_info member.
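
To make the suggestion concrete, here is a rough userspace sketch of the
layout I have in mind. It is only an illustration under assumptions:
struct aux_sketch stands in for bpf_work_struct, struct prog_sketch for
bpf_prog, and the gpl_compatible member is a made-up example of something
that would otherwise live in bpf_prog_info. Only refcnt is taken from the
patch (info->refcnt).

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the ancillary struct (bpf_work_struct in the
 * patch). 'refcnt' mirrors info->refcnt from the patch; 'gpl_compatible'
 * is an assumed example member, not taken from the patch.
 */
struct aux_sketch {
	int  refcnt;          /* would be an atomic_t in the kernel */
	bool gpl_compatible;  /* assumption, for illustration only */
};

/* Hypothetical stand-in for struct bpf_prog. */
struct prog_sketch {
	unsigned int       len;
	struct aux_sketch *aux;  /* valid for every program, so no has_info */
};

/* One ancillary allocation per program, made unconditionally. */
struct prog_sketch *prog_sketch_alloc(unsigned int len)
{
	struct prog_sketch *fp = calloc(1, sizeof(*fp));

	if (!fp)
		return NULL;

	fp->aux = calloc(1, sizeof(*fp->aux));
	if (!fp->aux) {
		free(fp);
		return NULL;
	}

	fp->len = len;
	fp->aux->refcnt = 1;
	return fp;
}

/* Frees the program together with its ancillary data. */
void prog_sketch_free(struct prog_sketch *fp)
{
	if (!fp)
		return;
	free(fp->aux);
	free(fp);
}
```

Because the ancillary pointer is set for every program, callers never
have to check a flag before dereferencing it, and the union with
orig_prog is no longer needed for this purpose.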

>   	struct bpf_work_struct	*work;		/* Deferred free work struct */
>   	unsigned int		(*bpf_func)(const struct sk_buff *skb,
>   					    const struct bpf_insn *filter);
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 3a03fdf4db0e..1d0411965576 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -99,12 +99,23 @@ enum bpf_cmd {
...
> +/* called by sockets/tracing/seccomp before attaching program to an event
> + * pairs with bpf_prog_put()
> + */

But seccomp already does refcounting on each BPF filter. Or, is the
intention to remove this from seccomp?

> +struct bpf_prog *bpf_prog_get(u32 ufd)
> +{
> +	struct fd f = fdget(ufd);
> +	struct bpf_prog *prog;
> +
> +	prog = get_prog(f);
> +
> +	if (IS_ERR(prog))
> +		return prog;
> +
> +	atomic_inc(&prog->info->refcnt);
> +	fdput(f);
> +	return prog;
> +}
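
For reference, the get/put pairing described in the comment above can be
sketched in userspace roughly as follows. This is a simplified model, not
the patch's code: it uses C11 atomics instead of atomic_t, and all names
(prog_ref_sketch, prog_ref_new, prog_ref_get, prog_ref_put) are
hypothetical. The creator holds the first reference, each attach point
takes one more, and the object is freed when the last one is dropped.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical refcounted object standing in for a loaded program. */
struct prog_ref_sketch {
	atomic_int refcnt;
};

/* Creation hands the first reference to the caller (the fd, in the patch). */
struct prog_ref_sketch *prog_ref_new(void)
{
	struct prog_ref_sketch *p = malloc(sizeof(*p));

	if (!p)
		return NULL;
	atomic_init(&p->refcnt, 1);
	return p;
}

/* Take an extra reference before attaching; pairs with prog_ref_put(). */
struct prog_ref_sketch *prog_ref_get(struct prog_ref_sketch *p)
{
	atomic_fetch_add(&p->refcnt, 1);
	return p;
}

/* Drop one reference. Returns true if this was the last one and the
 * object has been freed.
 */
bool prog_ref_put(struct prog_ref_sketch *p)
{
	int old = atomic_fetch_sub(&p->refcnt, 1);

	assert(old > 0);  /* a put without a matching get is a caller bug */
	if (old == 1) {
		free(p);
		return true;
	}
	return false;
}
```

With two independent counters on the same filter (one here, one in the
subsystem that attaches it), it has to be clear which of them actually
decides when the program is freed.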
...
> diff --git a/net/core/filter.c b/net/core/filter.c
> index dfc716ffa44b..d771e4f03745 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -835,6 +835,7 @@ static void bpf_release_orig_filter(struct bpf_prog *fp)
>   {
>   	struct sock_fprog_kern *fprog = fp->orig_prog;
>
> +	BUG_ON(fp->has_info);

Why BUG_ON() (also in so many other places)?
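
One possible alternative, sketched here in userspace under assumptions,
is to report the broken invariant and bail out instead of halting. The
struct and function names are hypothetical stand-ins for bpf_prog and
bpf_release_orig_filter(), and returning -EINVAL is only for the sake of
the example. In the kernel the guard would more likely be written as
"if (WARN_ON_ONCE(fp->has_info)) return;".

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sock_fprog_kern. */
struct fprog_sketch {
	int *filter;
};

/* Hypothetical stand-in for struct bpf_prog, reduced to two members. */
struct release_sketch {
	bool                 has_info;
	struct fprog_sketch *orig_prog;
};

/* Frees the original filter. If the invariant is violated, complains and
 * returns -EINVAL without touching orig_prog, rather than aborting.
 */
int release_orig_sketch(struct release_sketch *fp)
{
	if (fp->has_info) {
		fprintf(stderr, "release_orig_sketch: unexpected has_info\n");
		return -EINVAL;
	}

	if (fp->orig_prog) {
		free(fp->orig_prog->filter);
		free(fp->orig_prog);
		fp->orig_prog = NULL;
	}
	return 0;
}
```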

>   	if (fprog) {
>   		kfree(fprog->filter);
>   		kfree(fprog);
> @@ -973,6 +974,7 @@ static struct bpf_prog *bpf_prepare_filter(struct bpf_prog *fp)
>
>   	fp->bpf_func = NULL;
>   	fp->jited = false;
> +	fp->has_info = false;
>
>   	err = bpf_check_classic(fp->insns, fp->len);
>   	if (err) {
>