[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <649637e91a709_7bea820894@john.notmuch>
Date: Fri, 23 Jun 2023 17:25:13 -0700
From: John Fastabend <john.fastabend@...il.com>
To: Donald Hunter <donald.hunter@...il.com>,
Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: Stanislav Fomichev <sdf@...gle.com>,
bpf <bpf@...r.kernel.org>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>,
Martin KaFai Lau <martin.lau@...ux.dev>,
Song Liu <song@...nel.org>,
Yonghong Song <yhs@...com>,
John Fastabend <john.fastabend@...il.com>,
KP Singh <kpsingh@...nel.org>,
Hao Luo <haoluo@...gle.com>,
Jiri Olsa <jolsa@...nel.org>,
Network Development <netdev@...r.kernel.org>
Subject: Re: [RFC bpf-next v2 11/11] net/mlx5e: Support TX timestamp metadata
Donald Hunter wrote:
> Alexei Starovoitov <alexei.starovoitov@...il.com> writes:
>
> > On Thu, Jun 22, 2023 at 3:13 PM Stanislav Fomichev <sdf@...gle.com> wrote:
> >>
> >> We want to provide common sane interfaces/abstractions via kfuncs.
> >> That will make most BPF programs portable from mlx to brcm (for
> >> example) without doing a rewrite.
> >> We're also exposing raw (readonly) descriptors (via that get_ctx
> >> helper) to the users who know what to do with them.
> >> Most users don't know what to do with raw descriptors;
> >
> > Why do you think so?
> > Who are those users?
> > I see your proposal and thumbs up from onlookers.
> > afaict there are zero users for rx side hw hints too.
>
> We have customers in various sectors that want to use rx hw timestamps.
>
> There are several use cases especially in Telco where they use DPDK
> today and want to move to AF_XDP but they need to be able to benefit
> from the hw capabilities of the NICs they purchase. Not having access to
> hw offloads on rx and tx is a barrier to AF_XDP adoption.
>
> The most notable gaps in AF_XDP are checksum offloads and multi buffer
> support.
>
> >> the specs are
> >> not public; things can change depending on fw version/etc/etc.
> >> So the progs that touch raw descriptors are not the primary use-case.
> >> (that was the tl;dr for rx part, seems like it applies here?)
> >>
> >> Let's maybe discuss that mlx5 example? Are you proposing to do
> >> something along these lines?
> >>
> >> void mlx5e_devtx_submit(struct mlx5e_tx_wqe *wqe);
> >> void mlx5e_devtx_complete(struct mlx5_cqe64 *cqe);
> >>
> >> If yes, I'm missing how we define the common kfuncs in this case. The
> >> kfuncs need to have some common context. We're defining them with:
> >> bpf_devtx_<kfunc>(const struct devtx_frame *ctx);
> >
> > I'm looking at xdp_metadata and wondering who's using it.
> > I haven't seen a single bug report.
> > No bugs means no one is using it. There is zero chance that we managed
> > to implement it bug-free on the first try.
>
> Nobody is using xdp_metadata today, not because they don't want to, but
> because there was no consensus for how to use it. We have internal POCs
> that use xdp_metadata and are still trying to build the foundations
> needed to support it consistently across different hardware. Jesper
> Brouer proposed a way to describe xdp_metadata with BTF and it was
> rejected. The current plan to use kfuncs for xdp hints is the most
> recent attempt to find a solution.
The hold up on my side is getting it in a LST kernel so we can get it
deployed in real environments. Although my plan is to just cast the
ctx to a kernel ctx and read the data out we need.
>
> > So new tx side things look like a feature creep to me.
> > rx side is far from proven to be useful for anything.
> > Yet you want to add new things.
>From my side if we just had a hook there and could cast the ctx to
something BTF type safe so we can simply read through the descriptor
I think that would sufficient for many use cases. To write into the
descriptor that might take more thought a writeable BTF flag?
>
> We have telcos and large enterprises that either use DPDK or use
> proprietary solutions for getting traffic to user space. They want to
> move to AF_XDP but without at least RX and TX checksum offloads they are
> paying a CPU tax for using AF_XDP. One customer is also waiting for
> multi-buffer support to land before they can adopt AF_XDP.
>
> So, no it's not feature creep, it's a set of required features to reach
> minimum viable product to be able to move out of a lab and replace
> legacy in production.
Powered by blists - more mailing lists