Message-ID: <20160919201322.GA84770@ast-mbp.thefacebook.com>
Date: Mon, 19 Sep 2016 13:13:27 -0700
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Pablo Neira Ayuso <pablo@...filter.org>
Cc: Daniel Mack <daniel@...que.org>, htejun@...com,
daniel@...earbox.net, ast@...com, davem@...emloft.net,
kafai@...com, fw@...len.de, harald@...hat.com,
netdev@...r.kernel.org, sargun@...gun.me, cgroups@...r.kernel.org
Subject: Re: [PATCH v6 5/6] net: ipv4, ipv6: run cgroup eBPF egress programs
On Mon, Sep 19, 2016 at 09:19:10PM +0200, Pablo Neira Ayuso wrote:
> On Mon, Sep 19, 2016 at 06:44:00PM +0200, Daniel Mack wrote:
> > diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> > index 6001e78..5dc90aa 100644
> > --- a/net/ipv6/ip6_output.c
> > +++ b/net/ipv6/ip6_output.c
> > @@ -39,6 +39,7 @@
> > #include <linux/module.h>
> > #include <linux/slab.h>
> >
> > +#include <linux/bpf-cgroup.h>
> > #include <linux/netfilter.h>
> > #include <linux/netfilter_ipv6.h>
> >
> > @@ -143,6 +144,7 @@ int ip6_output(struct net *net, struct sock *sk, struct sk_buff *skb)
> > {
> > struct net_device *dev = skb_dst(skb)->dev;
> > struct inet6_dev *idev = ip6_dst_idev(skb_dst(skb));
> > + int ret;
> >
> > if (unlikely(idev->cnf.disable_ipv6)) {
> > IP6_INC_STATS(net, idev, IPSTATS_MIB_OUTDISCARDS);
> > @@ -150,6 +152,12 @@ int ip6_output(struct net *net, struct sock *sk, struct sk_buff *skb)
> > return 0;
> > }
> >
> > + ret = cgroup_bpf_run_filter(sk, skb, BPF_CGROUP_INET_EGRESS);
> > + if (ret) {
> > + kfree_skb(skb);
> > + return ret;
> > + }
>
> 1) If your goal is to filter packets, why so late? The sooner you
> enforce your policy, the fewer cycles you waste.
>
> Actually, did you look at Google's approach to this problem? They
> want to control this at the socket level, so you restrict what the
> process can actually bind. That enforces the policy way before you
> even send packets. On top of that, what they submitted is structured
> so that any process with CAP_NET_ADMIN can inspect the policy being
> applied and fetch a readable copy of it through a kernel interface.
>
> 2) This will turn the stack into a nightmare to debug, I predict. If
> any process with CAP_NET_ADMIN can potentially attach bpf blobs
> via these hooks, we will have to include in the network stack

A process without CAP_NET_ADMIN can already attach bpf blobs to
system calls via seccomp; bpf is already used for security and policing.
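
For example, a minimal sketch of an unprivileged process doing exactly
that (classic BPF here, since that is what seccomp takes; the getpid()
rule is an arbitrary example policy, and a production filter would also
check seccomp_data.arch):

#include <stddef.h>
#include <stdio.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

int main(void)
{
	struct sock_filter filter[] = {
		/* Load the syscall number from seccomp_data. */
		BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
			 offsetof(struct seccomp_data, nr)),
		/* Kill on getpid(), allow everything else. */
		BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_getpid, 0, 1),
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
	};
	struct sock_fprog prog = {
		.len = sizeof(filter) / sizeof(filter[0]),
		.filter = filter,
	};

	/* This is what lets an unprivileged task install the filter. */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0))
		perror("PR_SET_NO_NEW_PRIVS");
	if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog))
		perror("PR_SET_SECCOMP");
	return 0;
}
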
> traveling documentation something like: "Probably you have to check
> that your orchestrator is not dropping your packets for some
> reason". So I wonder how users will debug this and how the policy that
> your orchestrator applies will be exposed to userspace.
As far as bpf debuggability/visibility goes, there are various efforts
under way:
on the kernel side:
- ksym for jit-ed programs
- hash sum for prog code
- compact type information for maps and various pretty printers
- data flow analysis of the programs
on the user-space side:
- reconstructing the program in a high-level language from bpf asm
  (p4-to-bpf already exists; this effort is about bpf-to-p4)
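
To make this concrete, a rough userspace sketch of attaching a trivial
"allow everything" egress program at the BPF_CGROUP_INET_EGRESS hook the
quoted hunk wires into ip6_output(). It uses the BPF_PROG_ATTACH
interface added earlier in this series (constants and bpf_attr fields as
eventually merged; v6 details may differ, and the cgroup path is a
made-up example):

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/bpf.h>

static int bpf(int cmd, union bpf_attr *attr, unsigned int size)
{
	return syscall(__NR_bpf, cmd, attr, size);
}

int main(void)
{
	/* Trivial filter: r0 = 1 means "let the packet pass"
	 * under the return convention of this series. */
	struct bpf_insn insns[] = {
		{ .code = BPF_ALU64 | BPF_MOV | BPF_K,
		  .dst_reg = BPF_REG_0, .imm = 1 },
		{ .code = BPF_JMP | BPF_EXIT },
	};
	union bpf_attr attr;
	int prog_fd, cg_fd;

	memset(&attr, 0, sizeof(attr));
	attr.prog_type = BPF_PROG_TYPE_CGROUP_SKB;
	attr.insns = (__u64)(unsigned long)insns;
	attr.insn_cnt = sizeof(insns) / sizeof(insns[0]);
	attr.license = (__u64)(unsigned long)"GPL";
	prog_fd = bpf(BPF_PROG_LOAD, &attr, sizeof(attr));
	if (prog_fd < 0) {
		perror("BPF_PROG_LOAD");
		return 1;
	}

	/* cgroup (v2) the policy applies to -- example path only. */
	cg_fd = open("/sys/fs/cgroup/my-container", O_RDONLY);
	if (cg_fd < 0) {
		perror("open cgroup");
		return 1;
	}

	memset(&attr, 0, sizeof(attr));
	attr.target_fd = cg_fd;
	attr.attach_bpf_fd = prog_fd;
	attr.attach_type = BPF_CGROUP_INET_EGRESS;
	if (bpf(BPF_PROG_ATTACH, &attr, sizeof(attr)))
		perror("BPF_PROG_ATTACH");
	return 0;
}

Once attached, the hunk quoted above is where the verdict lands: a
non-zero return from cgroup_bpf_run_filter() frees the skb and aborts
the send.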