Message-ID: <60b7d1f7e3640_5c74020841@john-XPS-13-9370.notmuch>
Date:   Wed, 02 Jun 2021 11:46:15 -0700
From:   John Fastabend <john.fastabend@...il.com>
To:     Kumar Kartikeya Dwivedi <memxor@...il.com>,
        Martin KaFai Lau <kafai@...com>
Cc:     Toke Høiland-Jørgensen <toke@...hat.com>,
        Alexei Starovoitov <alexei.starovoitov@...il.com>,
        Cong Wang <xiyou.wangcong@...il.com>,
        David Miller <davem@...emloft.net>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andrii@...nel.org>,
        John Fastabend <john.fastabend@...il.com>,
        Lorenz Bauer <lmb@...udflare.com>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>,
        bpf <bpf@...r.kernel.org>, kernel-team <kernel-team@...com>
Subject: Re: [RFC PATCH bpf-next] bpf: Introduce bpf_timer

Kumar Kartikeya Dwivedi wrote:
> On Wed, Jun 02, 2021 at 11:24:36PM IST, Martin KaFai Lau wrote:
> > On Wed, Jun 02, 2021 at 10:48:02AM +0200, Toke Høiland-Jørgensen wrote:
> > > Alexei Starovoitov <alexei.starovoitov@...il.com> writes:
> > >
> > > >> > In general the garbage collection in any form doesn't scale.
> > > >> > The conntrack logic doesn't need it. The cillium conntrack is a great
> > > >> > example of how to implement a conntrack without GC.
> > > >>
> > > >> That is simply not a conntrack. We expire connections based on
> > > >> its time, not based on the size of the map where it residents.
> > > >
> > > > Sounds like your goal is to replicate existing kernel conntrack
> > > > as bpf program by doing exactly the same algorithm and repeating
> > > > the same mistakes. Then add kernel conntrack functions to allow list
> > > > of kfuncs (unstable helpers) and call them from your bpf progs.
> > >
> > > FYI, we're working on exactly this (exposing kernel conntrack to BPF).
> > > Hoping to have something to show for our efforts before too long, but
> > > it's still in a bit of an early stage...
> > Just curious, what conntrack functions will be made callable to BPF?
> 
> Initially we're planning to expose the equivalent of nf_conntrack_in and
> nf_conntrack_confirm to XDP and TC programs (so XDP one works without an skb,
> and TC one works with an skb), to map these to higher level lookup/insert.
> 
> --
> Kartikeya

I think this is a missed opportunity. I can't see any advantage to
tying an XDP datapath into nft. For local connections, use a socket
lookup; there is no need for tables at all. For middle boxes you need
some tables, but again I really don't see why you want nft here. An
entirely XDP-based connection tracker is going to be faster, easier
to debug, and easier to tune to do what you want as your use cases
change.

Other than architecture disagreements, the implementation of this
gets ugly. You will need to export a set of nft hooks, teach nft
about xdp_buffs, and then poke nft on every packet. Just looking
at nf_conntrack_in() tells me you likely need some serious surgery
there to make this work, and now you've forked a bunch of code that
could be done generically in BPF into some hard-coded C stuff you
will have to maintain. Or you do an ugly hack to convert xdp into
skb on every packet, but I'll NAK that because it really defeats
the point of XDP. Maybe the TC side is easier because you have an
skb, but then you miss the real win on the XDP side. Sorry, I don't
see any upsides here, just more work to review and maintain code
that is dubious to start with.

Anyway, the original timers code above LGTM.

.John