Message-ID: <20160922092138.GA12108@salvia>
Date: Thu, 22 Sep 2016 11:21:38 +0200
From: Pablo Neira Ayuso <pablo@...filter.org>
To: Thomas Graf <tgraf@...g.ch>
Cc: Daniel Mack <daniel@...que.org>, htejun@...com,
daniel@...earbox.net, ast@...com, davem@...emloft.net,
kafai@...com, fw@...len.de, harald@...hat.com,
netdev@...r.kernel.org, sargun@...gun.me, cgroups@...r.kernel.org
Subject: Re: [PATCH v6 5/6] net: ipv4, ipv6: run cgroup eBPF egress programs
On Wed, Sep 21, 2016 at 08:48:27PM +0200, Thomas Graf wrote:
> On 09/21/16 at 05:45pm, Pablo Neira Ayuso wrote:
> > On Tue, Sep 20, 2016 at 06:43:35PM +0200, Daniel Mack wrote:
> > > The point is that from an application's perspective, restricting the
> > > ability to bind a port and dropping packets that are being sent is a
> > > very different thing. Applications will start to behave differently if
> > > they can't bind to a port, and that's something we do not want to happen.
> >
> > What exactly is the problem? Applications are not checking the
> > return value from bind()? They should be fixed. If you want to
> > collect statistics, I see no reason why you couldn't collect them
> > for every EACCES on each bind() call.
>
> It's not about applications not checking the return value of bind().
> Unfortunately, many applications (or the respective libraries they use)
> retry on connect() failure but handle bind() errors as a hard failure
> and exit. Yes, it's an application or library bug, but these
> applications have very specific expectations of how something fails.
> Sometimes even going from drop to RST will break applications.
>
> Paranoia speaking: by returning errors where no error was returned
> before, undefined behaviour occurs. In Murphy speak: things break.
>
> This is a given and we can't fix it from the kernel side. Returning
> errors at the system call level has many benefits, but it's not
> always an option.
>
> Adding the late hook does not prevent filtering at the socket layer
> from also being added. I think we need both.
I have a hard time buying this new specific hook. I think we should
shift the focus of this debate; this is my proposal to untangle it:
You add a net/netfilter/nft_bpf.c expression that allows you to run
bpf programs from nf_tables. This expression can either run bpf
programs in a similar fashion to tc+bpf or run the bpf program that
you have attached to the cgroup.
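Roughly, the eval path of such an expression could look like the
sketch below. To be clear, this is hypothetical: nft_bpf.c does not
exist and struct nft_bpf is made up; only the nft_expr_ops eval
signature, nft_expr_priv() and BPF_PROG_RUN() are existing kernel
interfaces:

	/* net/netfilter/nft_bpf.c (hypothetical) */
	#include <linux/filter.h>		/* BPF_PROG_RUN, struct bpf_prog */
	#include <net/netfilter/nf_tables.h>	/* nft expression infrastructure */

	struct nft_bpf {
		struct bpf_prog	*prog;	/* attached via netlink, e.g. by fd */
	};

	static void nft_bpf_eval(const struct nft_expr *expr,
				 struct nft_regs *regs,
				 const struct nft_pktinfo *pkt)
	{
		const struct nft_bpf *priv = nft_expr_priv(expr);

		/* Run the eBPF program on the packet and map its return
		 * value to an nf_tables verdict.
		 */
		if (BPF_PROG_RUN(priv->prog, pkt->skb))
			regs->verdict.code = NFT_CONTINUE;
		else
			regs->verdict.code = NF_DROP;
	}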
To achieve this, I'd suggest you also add a new bpf chain type. That
new chain type would basically provide raw access to the netfilter
hooks via the nf_tables netlink interface. This bpf chain would
exclusively take rules that use this new bpf expression.
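The registration of that chain type could be modeled on the existing
"filter" chain type in net/netfilter/nf_tables_ipv4.c. Again, only a
sketch for the same hypothetical file: nft_bpf_do_chain() and the
"bpf" name are invented, while struct nf_chain_type,
nft_set_pktinfo() and nft_do_chain() are the existing interfaces:

	static unsigned int nft_bpf_do_chain(void *priv, struct sk_buff *skb,
					     const struct nf_hook_state *state)
	{
		struct nft_pktinfo pkt;

		/* Set up pktinfo and run the rules of this chain,
		 * including the bpf expression sketched above.
		 */
		nft_set_pktinfo(&pkt, skb, state);
		return nft_do_chain(&pkt, priv);
	}

	static const struct nf_chain_type nft_chain_bpf = {
		.name		= "bpf",
		.type		= NFT_CHAIN_T_DEFAULT,	/* or a new NFT_CHAIN_T_BPF */
		.family		= NFPROTO_IPV4,
		.owner		= THIS_MODULE,
		.hook_mask	= (1 << NF_INET_PRE_ROUTING) |
				  (1 << NF_INET_LOCAL_IN) |
				  (1 << NF_INET_FORWARD) |
				  (1 << NF_INET_LOCAL_OUT) |
				  (1 << NF_INET_POST_ROUTING),
		.hooks		= {
			[NF_INET_PRE_ROUTING]	= nft_bpf_do_chain,
			[NF_INET_LOCAL_IN]	= nft_bpf_do_chain,
			[NF_INET_FORWARD]	= nft_bpf_do_chain,
			[NF_INET_LOCAL_OUT]	= nft_bpf_do_chain,
			[NF_INET_POST_ROUTING]	= nft_bpf_do_chain,
		},
	};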
I see good things in this proposal:
* This is consistent with what we offer via tc+bpf.
* It becomes easily visible to the user that a bpf program is running
  in the packet path, or that cgroup+bpf filtering is going on. Thus,
  no matter what those orchestrators do, this filtering stays visible
  to sysadmins who are familiar with the existing command line
  tooling (see the example after this list).
* You get access to all of the existing netfilter hooks in one go.
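To make the visibility point concrete, such a setup would show up in
the standard ruleset listing. The chain type and rule syntax below
are invented for illustration, since no "bpf" chain type or
expression exists in nft today; only the generic add/list commands
are real:

	# nft add table ip filter
	# nft add chain ip filter out { type bpf hook output priority 0 \; }
	# nft add rule ip filter out bpf cgroup-prog
	# nft list ruleset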
A side note on this: I would suggest this conversation focuses on
discussing aspects at a slightly higher level rather than counting
raw load and store instructions... I think this effort requires
looking at the whole forest, instead of barfing at one single tree.
Genericity always comes at a slight cost, and to all the
programmability fans here, please remember that we have a generic
stack on our hands after all. So let's try to accommodate these new
requirements in a way that makes sense.
Thanks.