Message-ID: <CAJ+HfNjFtOFG2xGr9MZYO5rTPBqoV06MEAGqsgiQQqrudtJY0w@mail.gmail.com>
Date: Wed, 7 Feb 2018 22:11:16 +0100
From: Björn Töpel <bjorn.topel@...il.com>
To: Jesper Dangaard Brouer <brouer@...hat.com>
Cc: "Karlsson, Magnus" <magnus.karlsson@...el.com>,
"Duyck, Alexander H" <alexander.h.duyck@...el.com>,
Alexander Duyck <alexander.duyck@...il.com>,
John Fastabend <john.fastabend@...il.com>,
Alexei Starovoitov <ast@...com>,
Willem de Bruijn <willemdebruijn.kernel@...il.com>,
Daniel Borkmann <daniel@...earbox.net>,
Netdev <netdev@...r.kernel.org>,
Björn Töpel <bjorn.topel@...el.com>,
michael.lundkvist@...csson.com,
"Brandeburg, Jesse" <jesse.brandeburg@...el.com>,
"Singhai, Anjali" <anjali.singhai@...el.com>,
"Shaw, Jeffrey B" <jeffrey.b.shaw@...el.com>,
"Yigit, Ferruh" <ferruh.yigit@...el.com>,
"Zhang, Qi Z" <qi.z.zhang@...el.com>
Subject: Re: [RFC PATCH 05/24] bpf: added bpf_xdpsk_redirect
2018-02-05 14:42 GMT+01:00 Jesper Dangaard Brouer <brouer@...hat.com>:
> On Wed, 31 Jan 2018 14:53:37 +0100 Björn Töpel <bjorn.topel@...il.com> wrote:
>
>> The bpf_xdpsk_redirect call redirects the XDP context to the XDP
>> socket bound to the receiving queue (if any).
>
> As I explained in person at FOSDEM, my suggestion is to use the
> bpf-map infrastructure for AF_XDP redirect, but people on this
> upstream mailing list also need a chance to validate my idea ;-)
>
> The important thing to keep in mind is how we can still maintain an
> SPSC (Single Producer Single Consumer) relationship between an
> RX-queue and a userspace consumer process.
>
> This AF_XDP "FOSDEM" patchset stores the "xsk" (xdp_sock) pointer
> directly in the net_device (_rx[].netdev_rx_queue.xs) structure. This
> limits each RX-queue to servicing a single xdp_sock. That sounds good
> from an SPSC point of view, but it is not very flexible. With an
> "xdp_sock_map" we get the flexibility to select among multiple
> xdp_socks (via an XDP pre-filter selecting a different map), and
> still maintain an SPSC relationship. The RX-queue will just have
> several SPSC relationships with the individual xdp_socks.
>
> This is true for the AF_XDP copy mode, and requires fewer driver
> changes. For the AF_XDP zero-copy (ZC) mode, drivers need significant
> changes anyhow, and in the ZC case we will have to disallow multiple
> xdp_socks, which is simply achieved by checking whether the xdp_sock
> pointer returned from the map lookup matches the one for which
> userspace requested the driver to register its memory for RX-rings.
>
> The "xdp_sock_map" is an array, where the index corresponds to the
> queue_index. bpf_redirect_map() ignores the specified index and
> instead uses xdp_rxq_info->queue_index in the lookup.
>
> Notice that a bpf-map has no pinned relationship with the device or
> the XDP prog loaded. Thus, userspace needs to bind() this map to the
> device before traffic can flow, like the proposed bind() on the
> xdp_sock. This is to establish the SPSC binding. My proposal is that
> userspace inserts the xdp_sock file descriptor(s) in the map at the
> queue index, and the map (which is also just a file descriptor) is
> bound, maybe via bind(), to a specific device (via the ifindex). The
> kernel side will walk the map and perform the required actions on the
> xdp_socks it finds in the map.
>
As we discussed at FOSDEM, I like the idea of using a map. This also
opens the door to configuring the AF_XDP sockets via bpf code, like
sockmap does.
I'll have a stab at adding an "xdp_sock_map/xskmap" or similar, and
also extending the cgroup sock_ops to support AF_XDP sockets, so that
the xskmap can be configured from bpf-land.
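
To make the lookup semantics discussed above concrete, here is a rough
userspace C model. It is only a sketch, not kernel code and not what
the patch set implements: every name in it (xdp_sock_model,
xsk_map_model, the three helpers, MODEL_MAX_QUEUES) is made up for
illustration. It assumes the map is a fixed-size array indexed by
queue_index, that the redirect ignores the index supplied by the bpf
program and uses the RX-queue index instead, and that the ZC case only
accepts the socket that registered its memory with the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's xdp_sock; illustration only. */
struct xdp_sock_model {
	int id;
};

#define MODEL_MAX_QUEUES 4

/* Assumed layout: one slot per RX queue, NULL when no socket is set. */
struct xsk_map_model {
	struct xdp_sock_model *slots[MODEL_MAX_QUEUES];
};

/* Models userspace placing a socket at a queue index.
 * Returns 0 on success, -1 if the index is out of range. */
static int xsk_map_model_insert(struct xsk_map_model *map,
				unsigned int queue_index,
				struct xdp_sock_model *xs)
{
	assert(map != NULL);
	if (queue_index >= MODEL_MAX_QUEUES)
		return -1;
	map->slots[queue_index] = xs;
	return 0;
}

/* Models the redirect lookup: the index given by the bpf program is
 * deliberately ignored, and the receiving queue's index is used. */
static struct xdp_sock_model *
xsk_map_model_redirect(const struct xsk_map_model *map,
		       unsigned int ignored_index,
		       unsigned int rxq_queue_index)
{
	(void)ignored_index;
	assert(map != NULL);
	if (rxq_queue_index >= MODEL_MAX_QUEUES)
		return NULL;
	return map->slots[rxq_queue_index];
}

/* Models the ZC restriction: delivery is allowed only if the socket
 * found in the map is the one that registered memory with the driver. */
static int xsk_model_zc_allowed(const struct xdp_sock_model *looked_up,
				const struct xdp_sock_model *registered)
{
	return looked_up != NULL && looked_up == registered;
}
```

In this model, a frame arriving on queue 2 always resolves to slot 2
no matter which index the program passed, and an empty slot yields
NULL, which a real implementation would presumably treat as a drop or
a fall-through to the regular stack.
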
Björn
> The TX side is harder, as now multiple xdp_socks can have the same
> queue-pair ID with the same net_device. But Magnus proposes that this
> can be solved with hardware: newer NICs have many TX-queues, and the
> queue-pair ID is just an externally visible number, while the
> kernel-internal structure can point to a dedicated TX-queue per
> xdp_sock.
>
> --
> Best regards,
> Jesper Dangaard Brouer
> MSc.CS, Principal Kernel Engineer at Red Hat
> LinkedIn: http://www.linkedin.com/in/brouer