netdev - Re: Per-queue XDP programs, thoughts

Message-ID: <20190416141759.309f6435@cakuba.netronome.com>
Date:   Tue, 16 Apr 2019 14:17:59 -0700
From:   Jakub Kicinski <jakub.kicinski@...ronome.com>
To:     Björn Töpel <bjorn.topel@...il.com>
Cc:     Jesper Dangaard Brouer <brouer@...hat.com>,
        Björn Töpel 
        <bjorn.topel@...el.com>,
        Ilias Apalodimas <ilias.apalodimas@...aro.org>,
        Toke Høiland-Jørgensen 
        <toke@...hat.com>, "Karlsson, Magnus" <magnus.karlsson@...el.com>,
        maciej.fijalkowski@...el.com, Jason Wang <jasowang@...hat.com>,
        Alexei Starovoitov <ast@...com>,
        Daniel Borkmann <borkmann@...earbox.net>,
        John Fastabend <john.fastabend@...il.com>,
        David Miller <davem@...emloft.net>,
        Andy Gospodarek <andy@...yhouse.net>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        bpf <bpf@...r.kernel.org>, Thomas Graf <tgraf@...g.ch>,
        Thomas Monjalon <thomas@...jalon.net>,
        Jonathan Lemon <bsd@...com>
Subject: Re: Per-queue XDP programs, thoughts

On Tue, 16 Apr 2019 09:45:24 +0200, Björn Töpel wrote:
> > > > If we'd like to slice a netdevice into multiple queues. Isn't macvlan
> > > > or similar *virtual* netdevices a better path, instead of introducing
> > > > yet another abstraction?  
> >
> > Yes, the question of use cases is extremely important.  It seems
> > Mellanox is working on "spawning devlink ports" IOW slicing a device
> > into subdevices.  Which is a great way to run bifurcated DPDK/netdev
> > applications :/  If that gets merged I think we have to recalculate
> > what purpose AF_XDP is going to serve, if any.
> 
> I really like the subdevice-think, but let's have the drivers in the
> kernel. I don't see how the XDP view (including AF_XDP) changes with
> subdevices. My view on AF_XDP is that it's a socket that can
> receive/send data efficiently from/to the kernel. What subdevice
> *might* change is the requirement for a per-queue XDP program.

My worry is that the sockets are not expressive enough.  You can't have
a flower rule that forwards to a socket.  You can't have a flower rule
which forwards to an RSS context (AFAIK).  We have a model for doing
those things with port netdevs (A(incorrectly)KA representors).

> > > That is actually the reason I want XDP per-queue, as it is a way to
> > > offload the filtering to the hardware.  And if the per-queue XDP-prog
> > > becomes simple enough, the hardware can eliminate and do everything in
> > > hardware (hopefully).
> > >  
> > > > The control plane should IMO be outside of the XDP program.  
> >
> > ENOCOMPUTE :)  XDP program is the BPF byte code, it's never control
> > plane.  Do you mean application should not control the "context/
> > channel/subdev" creation?  
> 
> Yes, but I'm not sure. I'd like to hear more opinions.
> 
> Let me try to think out loud here. Say that per-queue XDP programs
> exist. The main XDP program receives all packets and makes the
> decision that a certain flow should end up in say queue X, and that
> the hardware supports offloading that. Should the knobs to program the
> hardware be exposed via BPF or by some other mechanism (perf ring to
> userland daemon)? Further, setting the XDP program per queue. Should
> that be done via XDP (the main XDP program has knowledge of all
> programs) or via say netlink (as XDP is today). One could argue that
> the per-queue setup should be a map (like tail-calls).

This is a philosophical discussion reminiscent of Saeed's control map
proposal.

I don't like the idea of purposefully shoehorning the networking
configuration into special maps.  It should probably be judged on a
patch-by-patch basis, though.

> > You're not saying "it's not the XDP program
> > which should be making the classification", no?  XDP program
> > controlling the classification was _the_ reason why we liked AF_XDP :)  
> 
> XDP program not doing classification would be weird. But if there's a
> scenario where *everything for a certain HW filter* ends up in an
> AF_XDP queue, should we require an XDP program? I've been going back
> and forth here... :-)