Message-ID: <CAKH8qBv7nWdknuf3ap_ekpAhMgvtmoJhZ3-HRuL8Wv70SBWMSQ@mail.gmail.com>
Date: Thu, 8 Dec 2022 15:45:58 -0800
From: Stanislav Fomichev <sdf@...gle.com>
To: Toke Høiland-Jørgensen <toke@...hat.com>
Cc: bpf@...r.kernel.org, ast@...nel.org, daniel@...earbox.net,
andrii@...nel.org, martin.lau@...ux.dev, song@...nel.org,
yhs@...com, john.fastabend@...il.com, kpsingh@...nel.org,
haoluo@...gle.com, jolsa@...nel.org,
Saeed Mahameed <saeedm@...dia.com>,
David Ahern <dsahern@...il.com>,
Jakub Kicinski <kuba@...nel.org>,
Willem de Bruijn <willemb@...gle.com>,
Jesper Dangaard Brouer <brouer@...hat.com>,
Anatoly Burakov <anatoly.burakov@...el.com>,
Alexander Lobakin <alexandr.lobakin@...el.com>,
Magnus Karlsson <magnus.karlsson@...il.com>,
Maryam Tahhan <mtahhan@...hat.com>, xdp-hints@...-project.net,
netdev@...r.kernel.org
Subject: Re: [PATCH bpf-next v3 11/12] mlx5: Support RX XDP metadata
On Thu, Dec 8, 2022 at 2:59 PM Toke Høiland-Jørgensen <toke@...hat.com> wrote:
>
> Stanislav Fomichev <sdf@...gle.com> writes:
>
> > From: Toke Høiland-Jørgensen <toke@...hat.com>
> >
> > Support RX hash and timestamp metadata kfuncs. We need to pass in the cqe
> > pointer to the mlx5e_skb_from* functions so it can be retrieved from the
> > XDP ctx to do this.
>
> So I finally managed to get enough ducks in a row to actually benchmark
> this. With the caveat that I suddenly can't get the timestamp support to
> work (it was working in an earlier version, but now
> timestamp_supported() just returns false). I'm not sure if this is an
> issue with the enablement patch, or if I just haven't gotten the
> hardware configured properly. I'll investigate some more, but figured
> I'd post these results now:
>
> Baseline XDP_DROP: 25,678,262 pps / 38.94 ns/pkt
> XDP_DROP + read metadata: 23,924,109 pps / 41.80 ns/pkt
> Overhead: 1,754,153 pps / 2.86 ns/pkt
>
> As per the above, this is with calling three kfuncs/pkt
> (metadata_supported(), rx_hash_supported() and rx_hash()). So that's
> ~0.95 ns per function call, which is a bit less, but not far off from
> the ~1.2 ns that I'm used to. The tests where I accidentally called the
> default kfuncs cut off ~1.3 ns for one less kfunc call, so it's
> definitely in that ballpark.
>
> I'm not doing anything with the data, just reading it into an on-stack
> buffer, so this is the smallest possible delta from just getting the
> data out of the driver. I did confirm that the call instructions are
> still in the BPF program bytecode when it's dumped back out from the
> kernel.
>
> -Toke
>
Oh, that's great, thanks for running the numbers! Will definitely
reference them in v4!
Presumably we could at least unroll most of the _supported callbacks
if we wanted to, since they should be relatively easy; but the numbers
look fine as is?
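
For reference, the per-packet and per-call figures quoted above can be
re-derived from the two pps numbers alone. The sketch below is only that
arithmetic; the function and constant names are made up for illustration
and are not part of the patch set or of any kernel API. It assumes the
overhead is split evenly across the three kfunc calls, as the quoted
message does.

```python
# Re-derive the quoted benchmark figures from the raw packet rates.
# All names here are hypothetical helpers, not kernel or driver symbols.

NS_PER_SEC = 1_000_000_000

BASELINE_PPS = 25_678_262   # XDP_DROP only
METADATA_PPS = 23_924_109   # XDP_DROP + read metadata
KFUNC_CALLS = 3             # metadata_supported(), rx_hash_supported(), rx_hash()


def ns_per_pkt(pps: int) -> float:
    """Convert a packets-per-second rate into nanoseconds per packet."""
    return NS_PER_SEC / pps


def overhead_ns(baseline_pps: int, test_pps: int) -> float:
    """Extra per-packet cost of the test run relative to the baseline."""
    return ns_per_pkt(test_pps) - ns_per_pkt(baseline_pps)


def per_call_ns(baseline_pps: int, test_pps: int, calls: int) -> float:
    """Average cost per kfunc call, assuming the overhead splits evenly."""
    return overhead_ns(baseline_pps, test_pps) / calls


if __name__ == "__main__":
    print(f"baseline: {ns_per_pkt(BASELINE_PPS):.2f} ns/pkt")
    print(f"metadata: {ns_per_pkt(METADATA_PPS):.2f} ns/pkt")
    print(f"overhead: {overhead_ns(BASELINE_PPS, METADATA_PPS):.2f} ns/pkt")
    print(f"per call: {per_call_ns(BASELINE_PPS, METADATA_PPS, KFUNC_CALLS):.2f} ns")
```

The even split is a simplification: the two _supported() checks and
rx_hash() need not cost the same, so the per-call value is an average
rather than a measurement of any single kfunc.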