Message-ID: <CAADnVQKUVDEg12jOc=5iKmfN-aHvFEtvFKVEDBFsmZizwkXT4w@mail.gmail.com>
Date: Fri, 23 Jun 2023 19:52:03 -0700
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: John Fastabend <john.fastabend@...il.com>
Cc: Donald Hunter <donald.hunter@...il.com>, Stanislav Fomichev <sdf@...gle.com>, bpf <bpf@...r.kernel.org>,
Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>, Martin KaFai Lau <martin.lau@...ux.dev>, Song Liu <song@...nel.org>,
Yonghong Song <yhs@...com>, KP Singh <kpsingh@...nel.org>, Hao Luo <haoluo@...gle.com>,
Jiri Olsa <jolsa@...nel.org>, Network Development <netdev@...r.kernel.org>
Subject: Re: [RFC bpf-next v2 11/11] net/mlx5e: Support TX timestamp metadata
On Fri, Jun 23, 2023 at 5:25 PM John Fastabend <john.fastabend@...il.com> wrote:
>
> Donald Hunter wrote:
> > Alexei Starovoitov <alexei.starovoitov@...il.com> writes:
> >
> > > On Thu, Jun 22, 2023 at 3:13 PM Stanislav Fomichev <sdf@...gle.com> wrote:
> > >>
> > >> We want to provide common sane interfaces/abstractions via kfuncs.
> > >> That will make most BPF programs portable from mlx to brcm (for
> > >> example) without doing a rewrite.
> > >> We're also exposing raw (readonly) descriptors (via that get_ctx
> > >> helper) to the users who know what to do with them.
> > >> Most users don't know what to do with raw descriptors;
> > >
> > > Why do you think so?
> > > Who are those users?
> > > I see your proposal and thumbs up from onlookers.
> > > afaict there are zero users for rx side hw hints too.
> >
> > We have customers in various sectors that want to use rx hw timestamps.
> >
> > There are several use cases especially in Telco where they use DPDK
> > today and want to move to AF_XDP but they need to be able to benefit
> > from the hw capabilities of the NICs they purchase. Not having access to
> > hw offloads on rx and tx is a barrier to AF_XDP adoption.
> >
> > The most notable gaps in AF_XDP are checksum offloads and multi buffer
> > support.
> >
> > >> the specs are
> > >> not public; things can change depending on fw version/etc/etc.
> > >> So the progs that touch raw descriptors are not the primary use-case.
> > >> (that was the tl;dr for rx part, seems like it applies here?)
> > >>
> > >> Let's maybe discuss that mlx5 example? Are you proposing to do
> > >> something along these lines?
> > >>
> > >> void mlx5e_devtx_submit(struct mlx5e_tx_wqe *wqe);
> > >> void mlx5e_devtx_complete(struct mlx5_cqe64 *cqe);
> > >>
> > >> If yes, I'm missing how we define the common kfuncs in this case. The
> > >> kfuncs need to have some common context. We're defining them with:
> > >> bpf_devtx_<kfunc>(const struct devtx_frame *ctx);
> > >
> > > I'm looking at xdp_metadata and wondering who's using it.
> > > I haven't seen a single bug report.
> > > No bugs means no one is using it. There is zero chance that we managed
> > > to implement it bug-free on the first try.
> >
> > Nobody is using xdp_metadata today, not because they don't want to, but
> > because there was no consensus for how to use it. We have internal POCs
> > that use xdp_metadata and are still trying to build the foundations
> > needed to support it consistently across different hardware. Jesper
> > Brouer proposed a way to describe xdp_metadata with BTF and it was
> > rejected. The current plan to use kfuncs for xdp hints is the most
> > recent attempt to find a solution.
>
> The hold up on my side is getting it in an LTS kernel so we can get it
> deployed in real environments. Although my plan is to just cast the
> ctx to a kernel ctx and read out the data we need.

+1

> >
> > > So new tx side things look like a feature creep to me.
> > > rx side is far from proven to be useful for anything.
> > > Yet you want to add new things.
>
> From my side if we just had a hook there and could cast the ctx to
> something BTF type safe so we can simply read through the descriptor
> I think that would be sufficient for many use cases. Writing into the
> descriptor might take more thought; a writeable BTF flag?
That's pretty much what I'm suggesting.
Add two driver-specific __weak nop hook points where necessary,
with a few driver-specific kfuncs.
Don't build generic infra when it's too early to generalize.
It would mean that bpf progs will be driver specific,
but when something novel like this is being proposed it's better
to start with minimal code change to the core kernel (ideally none)
and generalize once common things are found.
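
To make the hook-point idea concrete, here is a rough user-space sketch of
the pattern, not the actual mlx5 change. Every demo_* name and the
descriptor layout are hypothetical stand-ins; in the kernel the hook would
sit in the driver's TX path and a bpf prog would attach to it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a driver TX descriptor; the real
 * struct mlx5e_tx_wqe layout is not reproduced here. */
struct demo_tx_wqe {
	uint32_t opcode;
	uint32_t len;
};

/* Driver-specific nop hook point. Weak and noinline so the symbol
 * survives as a distinct function that something else can hook or
 * override. By default it does nothing. */
__attribute__((weak, noinline))
void demo_devtx_submit(struct demo_tx_wqe *wqe)
{
	(void)wqe;
}

/* Hypothetical TX path: fill the descriptor, then call the hook
 * right before the descriptor would be handed to hardware. */
int demo_xmit(struct demo_tx_wqe *wqe, uint32_t len)
{
	assert(wqe != NULL);

	wqe->opcode = 0x0a;	/* arbitrary value, for illustration only */
	wqe->len = len;

	demo_devtx_submit(wqe);	/* nop unless something hooks it */
	return 0;
}
```

With nothing attached the hook costs one call to an empty function, and
no core kernel code is touched; only the driver grows the hook and
whatever kfuncs it chooses to expose.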
Sounds like Stanislav's use case is timestamps on TX,
while Donald's is checksums on RX and TX. These use cases are too different.
Getting the HW to compute TX checksums under AF_XDP control takes
a lot more than what Stan is proposing for timestamps.