Message-ID: <CAKH8qBtDZo8Mmp=o_fomz97cXNGY6NgOOW8YbJCXx_+_dVf7uw@mail.gmail.com>
Date: Mon, 21 Nov 2022 09:53:02 -0800
From: Stanislav Fomichev <sdf@...gle.com>
To: Toke Høiland-Jørgensen <toke@...hat.com>
Cc: Jesper Dangaard Brouer <jbrouer@...hat.com>, bpf@...r.kernel.org,
brouer@...hat.com, ast@...nel.org, daniel@...earbox.net,
andrii@...nel.org, martin.lau@...ux.dev, song@...nel.org,
yhs@...com, john.fastabend@...il.com, kpsingh@...nel.org,
haoluo@...gle.com, jolsa@...nel.org,
David Ahern <dsahern@...il.com>,
Jakub Kicinski <kuba@...nel.org>,
Willem de Bruijn <willemb@...gle.com>,
Anatoly Burakov <anatoly.burakov@...el.com>,
Alexander Lobakin <alexandr.lobakin@...el.com>,
Magnus Karlsson <magnus.karlsson@...il.com>,
Maryam Tahhan <mtahhan@...hat.com>, xdp-hints@...-project.net,
netdev@...r.kernel.org
Subject: Re: [xdp-hints] Re: [PATCH bpf-next 06/11] xdp: Carry over xdp
metadata into skb context
On Sat, Nov 19, 2022 at 4:31 AM Toke Høiland-Jørgensen <toke@...hat.com> wrote:
>
> Stanislav Fomichev <sdf@...gle.com> writes:
>
> > On Fri, Nov 18, 2022 at 6:05 AM Jesper Dangaard Brouer
> > <jbrouer@...hat.com> wrote:
> >>
> >>
> >> On 15/11/2022 04.02, Stanislav Fomichev wrote:
> >> > Implement new bpf_xdp_metadata_export_to_skb kfunc which
> >> > prepares compatible xdp metadata for kernel consumption.
> >> > This kfunc should be called prior to bpf_redirect
> >> > or when XDP_PASS'ing the frame into the kernel (note, the drivers
> >> > have to be updated to enable consuming XDP_PASS'ed metadata).
> >> >
> >> > veth driver is amended to consume this metadata when converting to skb.
> >> >
> >> > Internally, XDP_FLAGS_HAS_SKB_METADATA flag is used to indicate
> >> > whether the frame has skb metadata. The metadata is currently
> >> > stored prior to xdp->data_meta. bpf_xdp_adjust_meta refuses
> >> > to work after a call to bpf_xdp_metadata_export_to_skb (can lift
> >> > this requirement later on if needed, we'd have to memmove
> >> > xdp_skb_metadata).
> >> >
> >>
> >> I think it is wrong to refuse use of the metadata area (bpf_xdp_adjust_meta)
> >> once the function bpf_xdp_metadata_export_to_skb() has been called.
> >> In my design they were supposed to co-exist, and the BPF-prog was expected to
> >> access this directly itself.
> >>
> >> With this current design, I think it is better to place the struct
> >> xdp_skb_metadata (maybe call it xdp_skb_hints) after xdp_frame (in the
> >> top of the frame). This way we don't conflict with metadata and
> >> headroom use-cases. Plus, verifier will keep BPF-prog from accessing
> >> this area directly (which seems to be one of the new design goals).
> >>
> >> By placing it after xdp_frame, I think it would be possible to let veth
> >> unroll functions seamlessly access this info for XDP_REDIRECT'ed
> >> xdp_frame's.
> >>
> >> WDYT?
> >
> > Not everyone seems to be happy with exposing this xdp_skb_metadata via
> > uapi though :-(
> > So I'll drop this part in the v2 for now. But let's definitely keep
> > talking about the future approach.
>
> Jakub was objecting to putting it in the UAPI header, but didn't we
> already agree that this wasn't necessary?
>
> I.e., if we just define
>
> struct xdp_skb_metadata *bpf_xdp_metadata_export_to_skb()
>
> as a kfunc, the xdp_skb_metadata struct won't appear in any UAPI headers
> and will only be accessible via BTF? And we can put the actual data
> wherever we choose, since that bit is nicely hidden behind the kfunc,
> while the returned pointer still allows programs to access it.
>
> We could even make that kfunc smart enough that it checks if the field
> is already populated and just returns the pointer to the existing data
> instead of re-populating it in this case (with a flag to override,
> maybe?).

Even if we only expose it via btf, I think the fact that we still
expose a somewhat fixed layout is the problem?
I'm not sure that the fact that we're not technically putting it in the
uapi header is the issue here, but maybe I'm wrong?
Jakub?
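
A minimal userspace sketch of the "populate once, then return the existing pointer" behaviour quoted above. Nothing here is kernel code: struct mock_frame, struct mock_skb_metadata, mock_export_to_skb() and MOCK_EXPORT_F_FORCE are hypothetical stand-ins, and the two-field layout is an assumption made only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical layout; the real xdp_skb_metadata is defined by the series. */
struct mock_skb_metadata {
	uint64_t rx_timestamp;
	uint32_t rx_hash;
};

/* Stand-in for a frame: the metadata storage is hidden behind the accessor. */
struct mock_frame {
	bool has_skb_metadata;		/* role of XDP_FLAGS_HAS_SKB_METADATA */
	struct mock_skb_metadata meta;
	uint64_t hw_timestamp;		/* pretend device-provided values */
	uint32_t hw_hash;
	unsigned int populate_count;	/* bookkeeping for this example only */
};

#define MOCK_EXPORT_F_FORCE 1u		/* hypothetical "override" flag */

/*
 * Return a pointer to the exported metadata. If it was already populated,
 * hand back the existing data untouched unless the caller forces a refresh.
 */
struct mock_skb_metadata *mock_export_to_skb(struct mock_frame *f,
					     uint32_t flags)
{
	assert(f != NULL);

	if (f->has_skb_metadata && !(flags & MOCK_EXPORT_F_FORCE))
		return &f->meta;

	f->meta.rx_timestamp = f->hw_timestamp;
	f->meta.rx_hash = f->hw_hash;
	f->has_skb_metadata = true;
	f->populate_count++;
	return &f->meta;
}
```

With this shape, a second caller that only wants to tweak one field gets the same pointer back and its earlier modifications survive, while the location of the storage stays an implementation detail of the accessor.
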
> > Putting it after xdp_frame SGTM; with this we seem to avoid the need
> > to memmove it on adjust_{head,meta}.
> >
> > But going back to the uapi part, what if we add separate kfunc
> > accessors for skb exported metadata?
> >
> > To export:
> > bpf_xdp_metadata_export_rx_timestamp_to_skb(ctx, rx_timestamp)
> > bpf_xdp_metadata_export_rx_hash_to_skb(ctx, rx_hash)
> > // ^^ these prepare xdp_skb_metadata after xdp_frame, but do not
> > expose it via uapi/af_xdp/etc
> >
> > Then bpf_xdp_metadata_export_to_skb can be a 'static inline' defined in
> > the headers:
> >
> > void bpf_xdp_metadata_export_to_skb(ctx)
> > {
> > 	if (bpf_xdp_metadata_rx_timestamp_supported(ctx))
> > 		bpf_xdp_metadata_export_rx_timestamp_to_skb(ctx,
> > 			bpf_xdp_metadata_rx_timestamp(ctx));
> > 	if (bpf_xdp_metadata_rx_hash_supported(ctx))
> > 		bpf_xdp_metadata_export_rx_hash_to_skb(ctx,
> > 			bpf_xdp_metadata_rx_hash(ctx));
> > }
>
> The problem with this is that the BPF programs then have to keep up with
> the kernel. I.e., if the kernel later adds support for a new field that
> is used in the SKB, old XDP programs won't be populating it, which seems
> suboptimal. I think rather the kernel should be in control of the SKB
> metadata, and just allow XDP to consume it (and change individual fields
> as needed).

Good point. Although it doesn't sound like a huge drawback to me? If
bpf_xdp_metadata_export_to_skb is a part of libbpf/libxdp, the new
fields will get populated after a library update.
> > The only issue, it seems, is that if the final bpf program would like
> > to export this metadata to af_xdp, it has to manually adj_meta and use
> > bpf_xdp_metadata_skb_rx_xxx to prepare a custom layout. Not sure
> > whether performance would suffer with this extra copy; but we can at
> > least try and see..
>
> If we write the metadata after the packet data, that could still be
> transferred to AF_XDP, couldn't it? Userspace would just have to know
> how to find and read it, like it would if it's before the metadata.

Right, but here we again bump into the fact that we need to somehow
communicate that layout to the userspace (via btf ids) which doesn't
make everybody excited :-)
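
For illustration only, one hypothetical way a userspace consumer could guard against a layout mismatch: the producer stores an identifier as the last word before the packet data, and the consumer refuses to parse anything it does not recognise. MOCK_LAYOUT_ID stands in for a BTF type id; the struct, its placement directly before the data, and mock_read_metadata() are all assumptions, not something the series defines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MOCK_LAYOUT_ID 1234u	/* hypothetical; stands in for a BTF type id */

/* Hypothetical layout; layout_id is last so it sits right before the data. */
struct mock_skb_metadata {
	uint64_t rx_timestamp;
	uint32_t rx_hash;
	uint32_t layout_id;
};

/*
 * Copy the metadata that precedes `data` into *out.
 * Returns 0 on success, -1 if there is not enough headroom or the
 * layout identifier is not the one this consumer was built for.
 */
int mock_read_metadata(const uint8_t *data, size_t headroom,
		       struct mock_skb_metadata *out)
{
	uint32_t id;

	assert(data != NULL && out != NULL);

	if (headroom < sizeof(*out))
		return -1;

	/* memcpy avoids unaligned loads from the packet buffer. */
	memcpy(&id, data - sizeof(id), sizeof(id));
	if (id != MOCK_LAYOUT_ID)
		return -1;

	memcpy(out, data - sizeof(*out), sizeof(*out));
	return 0;
}
```

The check only protects the consumer from misreading bytes; it does not remove the underlying concern that a fixed layout, once userspace depends on it, becomes hard to change.
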