Message-ID: <Yz8O0rn7ps/m8iGi@google.com>
Date: Thu, 6 Oct 2022 10:22:26 -0700
From: sdf@...gle.com
To: Maryam Tahhan <mtahhan@...hat.com>
Cc: "Toke Høiland-Jørgensen" <toke@...hat.com>,
Jakub Kicinski <kuba@...nel.org>,
Martin KaFai Lau <martin.lau@...ux.dev>,
Jesper Dangaard Brouer <jbrouer@...hat.com>,
brouer@...hat.com, bpf@...r.kernel.org, netdev@...r.kernel.org,
xdp-hints@...-project.net, larysa.zaremba@...el.com,
memxor@...il.com, Lorenzo Bianconi <lorenzo@...nel.org>,
Alexei Starovoitov <alexei.starovoitov@...il.com>,
Daniel Borkmann <borkmann@...earbox.net>,
Andrii Nakryiko <andrii.nakryiko@...il.com>,
dave@...cker.co.uk, Magnus Karlsson <magnus.karlsson@...el.com>,
bjorn@...nel.org
Subject: Re: [xdp-hints] Re: [PATCH RFCv2 bpf-next 00/18] XDP-hints: XDP
gaining access to HW offload hints via BTF
On 10/06, Maryam Tahhan wrote:
> On 05/10/2022 19:47, sdf@...gle.com wrote:
> > On 10/05, Toke Høiland-Jørgensen wrote:
> > > Stanislav Fomichev <sdf@...gle.com> writes:
> >
> > > > On Tue, Oct 4, 2022 at 5:59 PM Jakub Kicinski <kuba@...nel.org>
> wrote:
> > > >>
> > > >> On Tue, 4 Oct 2022 17:25:51 -0700 Martin KaFai Lau wrote:
> > > >> > An intentionally wild question, what does it take for the driver
> > > to return the
> > > >> > hints. Is the rx_desc and rx_queue enough? When the xdp prog
> > > is calling a
> > > >> > kfunc/bpf-helper, like 'hwtstamp = bpf_xdp_get_hwtstamp()', can
> > > the driver
> > > >> > replace it with some inline bpf code (like how the inline code
> > > is generated for
> > > >> > the map_lookup helper). The xdp prog can then store the
> > > hwstamp in the meta
> > > >> > area in any layout it wants.
> > > >>
> > > >> Since you mentioned it... FWIW that was always my preference
> > > rather than
> > > >> the BTF magic :) The jited image would have to be per-driver like
> we
> > > >> do for BPF offload but that's easy to do from the technical
> > > >> perspective (I doubt many deployments bind the same prog to
> multiple
> > > >> HW devices)..
> > > >
> > > > +1, sounds like a good alternative (got your reply while typing)
> > > > I'm not too versed in the rx_desc/rx_queue area, but seems like
> worst
> > > > case that bpf_xdp_get_hwtstamp can probably receive a xdp_md ctx and
> > > > parse it out from the pre-populated metadata?
> > > >
> > > > Btw, do we also need to think about the redirect case? What happens
> > > > when I redirect one frame from a device A with one metadata format
> to
> > > > a device B with another?
> >
> > > Yes, we absolutely do! In fact, to me this (redirects) is the main
> > > reason why we need the ID in the packet in the first place: when
> running
> > > on (say) a veth, an XDP program needs to be able to deal with packets
> > > from multiple physical NICs.
> >
> > > As far as API is concerned, my hope was that we could solve this with
> a
> > > CO-RE like approach where the program author just writes something
> like:
> >
> > > hw_tstamp = bpf_get_xdp_hint("hw_tstamp", u64);
> >
> > > and bpf_get_xdp_hint() is really a macro (or a special kind of
> > > relocation?) and libbpf would do the following on load:
> >
> > > - query the kernel BTF for all possible xdp_hint structs
> > > - figure out which of them have an 'u64 hw_tstamp' member
> > > - generate the necessary conditionals / jump table to disambiguate on
> > > the BTF_ID in the packet
> >
> >
> > > Now, if this is better done by a kfunc I'm not terribly opposed to
> that
> > > either, but I'm not sure it's actually better/easier to do in the
> kernel
> > > than in libbpf at load time?
> >
> > Replied in the other thread, but to reiterate here: then btf_id in the
> > metadata has to stay and we either pre-generate those bpf_get_xdp_hint()
> > at libbpf or at kfunc load time level as you mention.
> >
> > But the program essentially has to handle all possible hints' btf ids
> > thrown
> > at it by the system. Not sure about the performance in this case :-/
> > Maybe that's something that can be hidden behind "I might receive
> forwarded
> > packets and I know how to handle all metadata format" flag? By default,
> > we'll pre-generate parsing only for that specific device?
> I did a simple POC of Jesper's xdp-hints with AF-XDP and CNDP (Cloud Native
> Data Plane). In the cases where my app had access to the HW I didn't need
> to
> handle all possible hints... I knew what Drivers were on the system and
> they
> were the hints I needed to deal with.
> So at program init time I registered the relevant BTF_IDs (and some
> callback
> functions to handle them) from the NICs that were available to me in a
> simple tailq (tbh there were so few I could've probably used a static
> array).
> When processing the hints then I only needed to invoke the appropriate
> callback function based on the received BTF_ID. I didn't have massive
> chains of if...else if... else statements.
> In the case where we have redirection to a virtual NIC and we don't
> necessarily know the underlying hints that are exposed to the app, could
> we
> not still use the xdp_hints (as proposed by Jesper) themselves to indicate
> the relevant drivers to the application? or even indicate them via a map
> or
> something?
Ideally this should all be handled by the common infra (libbpf/libxdp?).
We probably don't want every xdp/af_xdp user to custom-implement all this
btf_id->layout parsing. That's the reason for the request for a selftest
that shows how metadata can be accessed from bpf/af_xdp.
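
To make the btf_id->layout idea concrete, here is a rough userspace sketch
of what such shared infra could hide from the user. Everything in it is
made up for illustration (struct hint_layout, hint_register(),
hint_get_hw_tstamp(), and the idea that each layout is just a byte offset
of a u64 hw_tstamp); it is not an existing libbpf/libxdp API, and a real
version would derive the offsets from kernel BTF instead of hardcoding them:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-driver layout: where hw_tstamp sits in the metadata. */
struct hint_layout {
	uint32_t btf_id;     /* ID the driver is assumed to put in the metadata */
	size_t   tstamp_off; /* byte offset of the u64 hw_tstamp member */
};

#define MAX_LAYOUTS 8

/* Small fixed-size registry, filled once at program init time. */
struct hint_registry {
	struct hint_layout layouts[MAX_LAYOUTS];
	size_t count;
};

/* Register one known layout; returns 0 on success, -1 if the table is full. */
static int hint_register(struct hint_registry *r, uint32_t btf_id,
			 size_t tstamp_off)
{
	if (r->count >= MAX_LAYOUTS)
		return -1;
	r->layouts[r->count].btf_id = btf_id;
	r->layouts[r->count].tstamp_off = tstamp_off;
	r->count++;
	return 0;
}

/*
 * Look up the layout for btf_id and copy hw_tstamp out of the metadata area.
 * Returns 0 on success, -1 for an unknown btf_id or too-short metadata.
 */
static int hint_get_hw_tstamp(const struct hint_registry *r, uint32_t btf_id,
			      const void *meta, size_t meta_len, uint64_t *out)
{
	for (size_t i = 0; i < r->count; i++) {
		const struct hint_layout *l = &r->layouts[i];

		if (l->btf_id != btf_id)
			continue;
		/* Bounds check before touching the metadata bytes. */
		if (meta_len < sizeof(*out) ||
		    l->tstamp_off > meta_len - sizeof(*out))
			return -1;
		/* memcpy avoids unaligned access on the packet buffer. */
		memcpy(out, (const uint8_t *)meta + l->tstamp_off, sizeof(*out));
		return 0;
	}
	return -1;
}
```

An application would call hint_register() once per NIC layout it expects
and then hint_get_hw_tstamp() per packet with the btf_id found in the
metadata. With only a handful of layouts the linear scan should be cheap,
and unknown IDs (e.g. frames redirected from a device nobody registered)
fail cleanly instead of being misparsed.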