Message-ID: <CA+hQ2+jzz2dZONYbW_+H6rE+u50a+r8p5yLtAWWSJFvjmnBz1g@mail.gmail.com>
Date: Tue, 17 Dec 2019 14:30:21 -0800
From: Luigi Rizzo <rizzo@....unipi.it>
To: Jesper Dangaard Brouer <brouer@...hat.com>
Cc: "Jubran, Samih" <sameehj@...zon.com>,
"Machulsky, Zorik" <zorik@...zon.com>,
Daniel Borkmann <borkmann@...earbox.net>,
David Miller <davem@...emloft.net>,
"Tzalik, Guy" <gtzalik@...zon.com>,
Ilias Apalodimas <ilias.apalodimas@...aro.org>,
Toke Høiland-Jørgensen <toke@...hat.com>,
"Kiyanovski, Arthur" <akiyano@...zon.com>,
Alexei Starovoitov <ast@...nel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
David Ahern <dsahern@...il.com>
Subject: Re: XDP multi-buffer design discussion
On Tue, Dec 17, 2019 at 12:46 AM Jesper Dangaard Brouer
<brouer@...hat.com> wrote:
>
> On Mon, 16 Dec 2019 20:15:12 -0800
> Luigi Rizzo <rizzo@....unipi.it> wrote:
>...
> > For some use cases, the bpf program could deduct the total length
> > looking at the L3 header.
>
> Yes, that actually good insight. I guess the BPF-program could also
> use this to detect that it doesn't have access to the full-lineary
> packet this way(?)
>
> > It won't work for XDP_TX response though.
>
> The XDP_TX case also need to be discussed/handled. IMHO need to support
> XDP_TX for multi-buffer frames. XDP_TX *can* be driver specific, but
> most drivers choose to convert xdp_buff to xdp_frame, which makes it
> possible to use/share part of the XDP_REDIRECT code from ndo_xdp_xmit.
>
> We also need to handle XDP_REDIRECT, which becomes challenging, as the
> ndo_xdp_xmit functions of *all* drivers need to be updated (or
> introduce a flag to handle this incrementally).
Here is a possible course of action (please let me know if you find loose ends):
1. extend struct xdp_buff with a total length and sk_buff * (NULL by default);
2. add a netdev callback to construct the skb for the current packet.
This code obviously already exists in all drivers; it just needs to be
exposed as a function callable by a bpf helper (next bullet);
3. add a new helper 'bpf_create_skb' that, when invoked, calls the previously
mentioned netdev callback to construct an skb for the current packet,
and sets the pointer in the xdp_buff, if not there already.
A bpf program that needs to access segments beyond the first one can
call bpf_create_skb() if needed, and then use existing helpers
(skb_load_bytes, skb_store_bytes, etc.) to access the skb.
My rationale is that if we need to access multiple segments, we are already
in expensive territory, and it makes little sense to define a multi-segment
format that would essentially be an skb.
4. implement a mechanism to let the driver know whether the currently
loaded bpf program understands the new format.
There are multiple ways to do that; a trivial one would be to check,
during load, that the program calls some known helper, e.g.
bpf_understands_fragments(), which is then jit-ed to something inexpensive.
Note that today, a netdev that cannot guarantee single segment packets
would not be able to enable xdp. Hence, without loss of functionality,
such a netdev can refuse to load a program without
bpf_understands_fragments().
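Items 1 to 3 above can be modelled in plain userspace C. This is only a sketch of the idea, not kernel code: every mock_* type and function is a hypothetical stand-in (for struct sk_buff, struct xdp_buff, the proposed netdev callback and the proposed bpf_create_skb helper), and the skb is modelled as a flat copy of all segments purely to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct sk_buff (flat copy of the packet). */
struct mock_skb { size_t len; unsigned char *data; };

/* One receive segment as a driver might track it (assumption). */
struct mock_seg { const unsigned char *data; size_t len; };

struct mock_xdp_buff;

/* Item 2: per-device callback that builds an skb for the current packet. */
struct mock_netdev {
    struct mock_skb *(*build_skb)(struct mock_xdp_buff *xdp);
    unsigned int build_calls;            /* counts how often the callback ran */
};

/* Item 1: xdp_buff model extended with a total length and an skb pointer. */
struct mock_xdp_buff {
    const unsigned char *data;           /* first segment only */
    const unsigned char *data_end;
    size_t total_len;                    /* length of the whole packet */
    struct mock_skb *skb;                /* NULL by default */
    struct mock_netdev *dev;
    const struct mock_seg *segs;         /* driver-private view of all segments */
    size_t nr_segs;
};

/* Example driver callback: gather every segment into one buffer. */
struct mock_skb *mock_driver_build_skb(struct mock_xdp_buff *xdp)
{
    struct mock_skb *skb = malloc(sizeof(*skb));
    size_t off = 0;

    if (!skb)
        return NULL;
    skb->data = malloc(xdp->total_len ? xdp->total_len : 1);
    if (!skb->data) {
        free(skb);
        return NULL;
    }
    for (size_t i = 0; i < xdp->nr_segs; i++) {
        /* The segments must never add up to more than total_len. */
        assert(xdp->segs[i].len <= xdp->total_len - off);
        memcpy(skb->data + off, xdp->segs[i].data, xdp->segs[i].len);
        off += xdp->segs[i].len;
    }
    skb->len = off;
    xdp->dev->build_calls++;
    return skb;
}

/* Item 3: model of the proposed helper; builds the skb only if not there already. */
int mock_bpf_create_skb(struct mock_xdp_buff *xdp)
{
    if (xdp->skb)
        return 0;
    xdp->skb = xdp->dev->build_skb(xdp);
    return xdp->skb ? 0 : -1;
}

/* Bounds-checked read across segments, in the spirit of skb_load_bytes. */
int mock_skb_load_bytes(const struct mock_skb *skb, size_t off, void *to, size_t len)
{
    if (off > skb->len || len > skb->len - off)
        return -1;
    memcpy(to, skb->data + off, len);
    return 0;
}
```

A program that only looks at the first segment never pays for the skb; one that needs bytes beyond it calls mock_bpf_create_skb() once and then reads through mock_skb_load_bytes(), even across a segment boundary.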
With all the above, the generic xdp handler would do the following:
if (!skb_is_linear() && !bpf_understands_fragments()) {
    <linearize skb>;
}
<construct xdp_buff with first segment and skb> // skb is unused by old style programs
<call bpf program>
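The generic-handler pseudocode above could look roughly like this when modelled in userspace C. It is a sketch under assumptions: the mock_* names are hypothetical, an skb is reduced to a head length plus a fragment length, and "linearize" simply folds the fragment bytes into the head.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical skb model: bytes in the linear head and bytes in fragments. */
struct mock_skb { size_t head_len; size_t frag_len; };

/* Result of the load-time check for the (proposed) fragments marker helper. */
struct mock_prog { bool understands_fragments; };

/* What the bpf program gets to see. */
struct mock_xdp_buff { size_t first_seg_len; size_t total_len; struct mock_skb *skb; };

bool mock_skb_is_linear(const struct mock_skb *skb)
{
    return skb->frag_len == 0;
}

/* Model of linearization: all fragment bytes move into the head. */
void mock_skb_linearize(struct mock_skb *skb)
{
    skb->head_len += skb->frag_len;
    skb->frag_len = 0;
}

struct mock_xdp_buff mock_generic_xdp_prepare(struct mock_skb *skb,
                                              const struct mock_prog *prog)
{
    struct mock_xdp_buff xdp;

    /* Only old style programs pay the linearization cost. */
    if (!mock_skb_is_linear(skb) && !prog->understands_fragments) {
        mock_skb_linearize(skb);
        assert(mock_skb_is_linear(skb));
    }

    xdp.first_seg_len = skb->head_len;
    xdp.total_len = skb->head_len + skb->frag_len;
    xdp.skb = skb;                /* ignored by old style programs */
    return xdp;
}
```

With a 100-byte head and 400 bytes of fragments, an old style program sees a single 500-byte segment, while a fragment-aware program sees a 100-byte first segment and a total length of 500.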
The native driver for a device that cannot guarantee a single segment
would just refuse to load a program that does not understand fragments
(same as today), so the code would be:
<construct xdp_buff with first segment and empty skb>
<call bpf program>
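The native-driver case can be sketched in the same way. All names here are hypothetical, the -EINVAL return value is an assumption (the mail does not say which error a driver would report), and the receive path is reduced to filling in the xdp_buff model with an empty skb pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Result of the load-time check for the (proposed) fragments marker helper. */
struct mock_prog { bool understands_fragments; };

/* Hypothetical device model. */
struct mock_netdev {
    bool single_segment_guaranteed;   /* false e.g. when RX may span buffers */
    const struct mock_prog *xdp_prog;
};

/* Attach-time check: refuse programs that cannot cope with fragments. */
int mock_native_xdp_attach(struct mock_netdev *dev, const struct mock_prog *prog)
{
    assert(dev && prog);
    if (!dev->single_segment_guaranteed && !prog->understands_fragments)
        return -EINVAL;               /* assumption: error code is not specified */
    dev->xdp_prog = prog;
    return 0;
}

struct mock_xdp_buff { size_t first_seg_len; size_t total_len; void *skb; };

/* RX path: first segment plus total length, no skb built up front. */
struct mock_xdp_buff mock_native_rx_prepare(size_t first_seg_len, size_t total_len)
{
    struct mock_xdp_buff xdp = { first_seg_len, total_len, NULL };
    return xdp;
}
```

Because the check happens once at attach time, the per-packet path needs no test and no linearization.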
On return, we might find that an skb has been built by the xdp program,
and can be immediately used for XDP_PASS (or dropped in case of XDP_DROP).
For XDP_TX and XDP_REDIRECT, something similar: if the packet is a
single segment and there is no skb, use the existing accelerated path.
If there are multiple segments, construct the skb if it does not exist
already, and pass it to the standard tx path.
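The handling of the program's return value described above can be written as a small decision function. This is a userspace sketch with hypothetical names (the MOCK_XDP_* values do not match the kernel's enum xdp_action). One case is not covered by the text and is filled in as an assumption: a single-segment packet for which the program already built an skb is sent through the skb-based tx path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical verdicts; values differ from the kernel's enum xdp_action. */
enum mock_xdp_action { MOCK_XDP_DROP, MOCK_XDP_PASS, MOCK_XDP_TX, MOCK_XDP_REDIRECT };

enum mock_path {
    MOCK_PATH_FREE,        /* drop the buffers (and the skb, if one was built) */
    MOCK_PATH_STACK,       /* hand an skb to the network stack */
    MOCK_PATH_FAST_XMIT,   /* existing accelerated XDP_TX / redirect path */
    MOCK_PATH_SKB_XMIT,    /* standard skb-based tx path */
};

struct mock_verdict {
    enum mock_path path;
    bool must_build_skb;   /* driver still has to construct the skb */
};

struct mock_verdict mock_xdp_dispatch(enum mock_xdp_action act,
                                      size_t nr_segs, bool has_skb)
{
    struct mock_verdict v = { MOCK_PATH_FREE, false };

    assert(nr_segs >= 1);

    switch (act) {
    case MOCK_XDP_DROP:
        break;
    case MOCK_XDP_PASS:
        /* Reuse the skb if the program already built one. */
        v.path = MOCK_PATH_STACK;
        v.must_build_skb = !has_skb;
        break;
    case MOCK_XDP_TX:
    case MOCK_XDP_REDIRECT:
        if (nr_segs == 1 && !has_skb) {
            v.path = MOCK_PATH_FAST_XMIT;
        } else {
            /* Multiple segments, or (assumption) an skb already exists. */
            v.path = MOCK_PATH_SKB_XMIT;
            v.must_build_skb = !has_skb;
        }
        break;
    }
    return v;
}
```

Only the multi-segment (or skb-already-built) cases leave the accelerated path, so single-segment traffic keeps its current cost.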
cheers
luigi