Message-ID: <48ba6e77-1695-50b3-b27f-e82750ee70bb@redhat.com>
Date: Wed, 2 Nov 2022 15:06:41 +0100
From: Jesper Dangaard Brouer <jbrouer@...hat.com>
To: Martin KaFai Lau <martin.lau@...ux.dev>,
Stanislav Fomichev <sdf@...gle.com>
Cc: brouer@...hat.com,
"Bezdeka, Florian" <florian.bezdeka@...mens.com>,
"kuba@...nel.org" <kuba@...nel.org>,
"john.fastabend@...il.com" <john.fastabend@...il.com>,
"alexandr.lobakin@...el.com" <alexandr.lobakin@...el.com>,
"anatoly.burakov@...el.com" <anatoly.burakov@...el.com>,
"song@...nel.org" <song@...nel.org>,
"Deric, Nemanja" <nemanja.deric@...mens.com>,
"andrii@...nel.org" <andrii@...nel.org>,
"Kiszka, Jan" <jan.kiszka@...mens.com>,
"magnus.karlsson@...il.com" <magnus.karlsson@...il.com>,
"willemb@...gle.com" <willemb@...gle.com>,
"ast@...nel.org" <ast@...nel.org>, "yhs@...com" <yhs@...com>,
"kpsingh@...nel.org" <kpsingh@...nel.org>,
"daniel@...earbox.net" <daniel@...earbox.net>,
"bpf@...r.kernel.org" <bpf@...r.kernel.org>,
"mtahhan@...hat.com" <mtahhan@...hat.com>,
"xdp-hints@...-project.net" <xdp-hints@...-project.net>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"jolsa@...nel.org" <jolsa@...nel.org>,
"haoluo@...gle.com" <haoluo@...gle.com>,
Toke Hoiland Jorgensen <toke@...hat.com>
Subject: Re: [xdp-hints] Re: [RFC bpf-next 0/5] xdp: hints via kfuncs
On 01/11/2022 18.05, Martin KaFai Lau wrote:
> On 10/31/22 6:59 PM, Stanislav Fomichev wrote:
>> On Mon, Oct 31, 2022 at 3:57 PM Martin KaFai Lau
>> <martin.lau@...ux.dev> wrote:
>>>
>>> On 10/31/22 10:00 AM, Stanislav Fomichev wrote:
>>>>> 2. AF_XDP programs won't be able to access the metadata without using a
>>>>> custom XDP program that calls the kfuncs and puts the data into the
>>>>> metadata area. We could solve this with some code in libxdp, though; if
>>>>> this code can be made generic enough (so it just dumps the available
>>>>> metadata functions from the running kernel at load time), it may be
>>>>> possible to make it generic enough that it will be forward-compatible
>>>>> with new versions of the kernel that add new fields, which should
>>>>> alleviate Florian's concern about keeping things in sync.
>>>>
>>>> Good point. I had to convert to a custom program to use the kfuncs :-(
>>>> But your suggestion sounds good; maybe libxdp can accept some extra
>>>> info about at which offset the user would like to place the metadata
>>>> and the library can generate the required bytecode?
>>>>
>>>>> 3. It will make it harder to consume the metadata when building SKBs. I
>>>>> think the CPUMAP and veth use cases are also quite important, and that
>>>>> we want metadata to be available for building SKBs in this path. Maybe
>>>>> this can be resolved by having a convenient kfunc for this that can be
>>>>> used for programs doing such redirects. E.g., you could just call
>>>>> xdp_copy_metadata_for_skb() before doing the bpf_redirect, and that
>>>>> would recursively expand into all the kfunc calls needed to extract the
>>>>> metadata supported by the SKB path?
>>>>
>>>> So this xdp_copy_metadata_for_skb will create a metadata layout that
>>>
>>> Can the xdp_copy_metadata_for_skb be written as a bpf prog itself?
>>> Not sure where is the best point to specify this prog though.
>>> Somehow during bpf_xdp_redirect_map?
>>> or this prog belongs to the target cpumap and the xdp prog redirecting
>>> to this cpumap has to write the meta layout in a way that the cpumap is
>>> expecting?
>>
>> We're probably interested in triggering it from the places where xdp
>> frames can eventually be converted into skbs?
>> So for plain 'return XDP_PASS' and things like bpf_redirect/etc? (IOW,
>> anything that's not XDP_DROP / AF_XDP redirect).
>> We can probably make it magically work, and can generate
>> kernel-digestible metadata whenever data == data_meta, but the
>> question - should we?
>> (need to make sure we won't regress any existing cases that are not
>> relying on the metadata)
>
> Instead of having some kernel-digestible meta data, how about calling
> another bpf prog to initialize the skb fields from the meta area after
> __xdp_build_skb_from_frame() in the cpumap, so
> run_xdp_set_skb_fields_from_metadata() may be a better name.
>
I very much like this idea of calling another bpf prog to initialize the
SKB fields from the meta area. (As a reminder, the data needs to come from
the meta area, because at this point the hardware RX-desc is out of scope.)

I'm on board with xdp_copy_metadata_for_skb() populating the meta area.
We could invoke this BPF-prog inside __xdp_build_skb_from_frame().

We might need a new BPF_PROG_TYPE_XDP2SKB, as this new BPF-prog
run_xdp_set_skb_fields_from_metadata() would need both xdp_buff + SKB as
context inputs. Right? (Not sure if this is acceptable under the BPF
maintainers' new rules.)
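
To make the "both xdp_buff + SKB as context" point concrete, here is a
minimal user-space sketch in plain C. It is not kernel code: the mock_*
structs, the MOCK_META_* bits and the metadata layout are all assumptions
made up for illustration, and only the function name is taken from this
thread. It shows a program that sees the meta area and the new SKB at the
same time and copies over whatever fields the RX-side program marked valid.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for the few sk_buff fields of interest (hypothetical). */
struct mock_skb {
	uint32_t hash;
	uint16_t vlan_tci;
	bool hash_valid;
	bool vlan_valid;
};

/* Hypothetical layout that the RX-side XDP program stored in the meta area. */
struct mock_xdp_meta {
	uint32_t rx_hash;
	uint16_t vlan_tci;
	uint16_t valid;		/* bitmask of MOCK_META_* below */
};

#define MOCK_META_HASH	(1u << 0)
#define MOCK_META_VLAN	(1u << 1)

/* Hypothetical context for a BPF_PROG_TYPE_XDP2SKB program: it sees the
 * meta area of the XDP frame and the freshly built SKB at the same time.
 */
struct mock_xdp2skb_ctx {
	const uint8_t *data_meta;	/* start of meta area */
	const uint8_t *data;		/* start of packet data, end of meta area */
	struct mock_skb *skb;
};

/* Returns 0 on success, -1 if the meta area is too small to decode. */
int run_xdp_set_skb_fields_from_metadata(struct mock_xdp2skb_ctx *ctx)
{
	struct mock_xdp_meta meta;

	/* Bounds check before touching the meta area. */
	if (ctx->data < ctx->data_meta ||
	    (size_t)(ctx->data - ctx->data_meta) < sizeof(meta))
		return -1;

	/* The RX descriptor is gone by now; the meta area is the only source. */
	memcpy(&meta, ctx->data_meta, sizeof(meta));

	if (meta.valid & MOCK_META_HASH) {
		ctx->skb->hash = meta.rx_hash;
		ctx->skb->hash_valid = true;
	}
	if (meta.valid & MOCK_META_VLAN) {
		ctx->skb->vlan_tci = meta.vlan_tci;
		ctx->skb->vlan_valid = true;
	}
	return 0;
}
```

In this sketch the meta area is read-only and only the SKB is written,
which is one possible split of responsibilities and not something decided
in this thread.
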
> The xdp_prog@rx sets the meta data and then redirects. If the
> xdp_prog@rx can also specify an xdp prog to initialize the skb fields
> from the meta area, then there is no need to have a kfunc to enforce a
> kernel-digestible layout. Not sure what is a good way to specify this
> xdp_prog though...
The challenge of running this (BPF_PROG_TYPE_XDP2SKB) BPF-prog inside
__xdp_build_skb_from_frame() is that it needs to know how to decode the
meta area for every device driver or XDP-prog populating it (as veth
and cpumap can get redirected packets from multiple device drivers).
Sure, using a common function/helper/macro like
xdp_copy_metadata_for_skb() could help reduce this multiplexing, but we
want to have maximum flexibility to extend this without having to update
the kernel, right?

Fortunately __xdp_build_skb_from_frame() has a net_device parameter
that points to the device the frame was received on (xdp_frame->dev_rx).
Thus, we could extend net_device and add this BPF-prog on a per
net_device basis. This could function as a driver BPF-prog callback
that populates the SKB fields from the XDP meta data.
Is this a good or bad idea?
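
A rough user-space model of the per-net_device idea, in plain C, may help
when weighing it. Everything here is hypothetical: the mock_* types, the
xdp2skb_prog member, and the two made-up "driver" layouts. A function
pointer stands in for the attached BPF-prog. The point is only that the
receiving device selects the decoder, so the SKB build path does not need
to understand each layout itself.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct mock_skb {
	uint32_t hash;
};

/* Stand-in for a BPF-prog attached to a device. */
typedef int (*mock_xdp2skb_prog_t)(const uint8_t *meta, size_t meta_len,
				   struct mock_skb *skb);

struct mock_net_device {
	const char *name;
	mock_xdp2skb_prog_t xdp2skb_prog;	/* hypothetical member, may be NULL */
};

struct mock_xdp_frame {
	const uint8_t *meta;
	size_t meta_len;
	struct mock_net_device *dev_rx;		/* device the frame arrived on */
};

/* Made-up layout for "driver A": hash at offset 0. */
int drv_a_xdp2skb(const uint8_t *meta, size_t meta_len, struct mock_skb *skb)
{
	if (meta_len < 4)
		return -1;
	memcpy(&skb->hash, meta, 4);
	return 0;
}

/* Made-up layout for "driver B": 4 bytes of other data, then the hash. */
int drv_b_xdp2skb(const uint8_t *meta, size_t meta_len, struct mock_skb *skb)
{
	if (meta_len < 8)
		return -1;
	memcpy(&skb->hash, meta + 4, 4);
	return 0;
}

/* Models the SKB build step: dispatch on the receiving device. */
int mock_build_skb_from_frame(const struct mock_xdp_frame *xdpf,
			      struct mock_skb *skb)
{
	const struct mock_net_device *dev = xdpf->dev_rx;

	memset(skb, 0, sizeof(*skb));
	if (!dev || !dev->xdp2skb_prog || xdpf->meta_len == 0)
		return 0;	/* nothing attached or no metadata: plain SKB */
	return dev->xdp2skb_prog(xdpf->meta, xdpf->meta_len, skb);
}
```

Frames from "driver A" and "driver B" carry the hash at different offsets,
yet both end up with the same SKB field, while a device with no prog
attached gets a plain SKB as today.
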
>>>> the kernel will be able to understand when converting back to skb?
>>>> IIUC, the xdp program will look something like the following:
>>>>
>>>> if (xdp packet is to be consumed by af_xdp) {
>>>>     // do a bunch of bpf_xdp_metadata_<metadata> calls and assemble
>>>>     // your own metadata layout
>>>>     return bpf_redirect_map(xsk, ...);
>>>> } else {
>>>>     // if the packet is to be consumed by the kernel
>>>>     xdp_copy_metadata_for_skb(ctx);
>>>>     return bpf_redirect(...);
>>>> }
>>>>
>>>> Sounds like a great suggestion! xdp_copy_metadata_for_skb can maybe
>>>> put some magic number in the first byte(s) of the metadata so the
>>>> kernel can check whether xdp_copy_metadata_for_skb has been called
>>>> previously (or maybe xdp_frame can carry this extra signal, idk).
I'm in favor of adding a flag bit to xdp_frame to signal this.
In __xdp_build_skb_from_frame() we could check this flag and then
invoke the BPF-prog of type BPF_PROG_TYPE_XDP2SKB.
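
The flag check could look roughly like the following user-space sketch in
plain C. The flag name and value, the mock_* types and the single-field
"meta area" are assumptions for illustration only, not existing kernel
definitions. The helper marks the frame when it writes the metadata, and
the SKB build step runs the decoding prog only for marked frames, so a
custom layout (e.g. one meant for AF_XDP) is never misread as
kernel-digestible.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit; name and value are made up for this sketch. */
#define MOCK_XDP_FLAGS_META_FOR_SKB	(1u << 0)

struct mock_skb {
	uint32_t hash;
	bool hash_valid;
};

struct mock_xdp_frame {
	uint32_t flags;
	uint32_t meta_hash;	/* stands in for the whole meta area */
};

/* Stand-in for xdp_copy_metadata_for_skb(): fill meta and mark the frame. */
void mock_xdp_copy_metadata_for_skb(struct mock_xdp_frame *xdpf,
				    uint32_t rx_hash)
{
	xdpf->meta_hash = rx_hash;
	xdpf->flags |= MOCK_XDP_FLAGS_META_FOR_SKB;
}

/* Stand-in for the BPF_PROG_TYPE_XDP2SKB prog. */
void mock_xdp2skb_prog(const struct mock_xdp_frame *xdpf,
		       struct mock_skb *skb)
{
	skb->hash = xdpf->meta_hash;
	skb->hash_valid = true;
}

/* Models the check in the SKB build step: only marked frames are decoded. */
void mock_build_skb_from_frame(const struct mock_xdp_frame *xdpf,
			       struct mock_skb *skb)
{
	skb->hash = 0;
	skb->hash_valid = false;
	if (xdpf->flags & MOCK_XDP_FLAGS_META_FOR_SKB)
		mock_xdp2skb_prog(xdpf, skb);
}
```

Compared with a magic number in the first byte(s) of the metadata, a flag
bit in this model costs no space in the meta area and cannot collide with
bytes written by an unrelated XDP-prog.
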
--Jesper