Message-ID: <CAKH8qBswBu7QAWySWOYK4X41mwpdBj0z=6A9WBHjVYQFq9Pzjw@mail.gmail.com>
Date:   Thu, 8 Dec 2022 18:57:50 -0800
From:   Stanislav Fomichev <sdf@...gle.com>
To:     Toke Høiland-Jørgensen <toke@...hat.com>
Cc:     Alexei Starovoitov <alexei.starovoitov@...il.com>,
        bpf <bpf@...r.kernel.org>, Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andrii@...nel.org>,
        Martin KaFai Lau <martin.lau@...ux.dev>,
        Song Liu <song@...nel.org>, Yonghong Song <yhs@...com>,
        John Fastabend <john.fastabend@...il.com>,
        KP Singh <kpsingh@...nel.org>, Hao Luo <haoluo@...gle.com>,
        Jiri Olsa <jolsa@...nel.org>,
        Saeed Mahameed <saeedm@...dia.com>,
        David Ahern <dsahern@...il.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Willem de Bruijn <willemb@...gle.com>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        Anatoly Burakov <anatoly.burakov@...el.com>,
        Alexander Lobakin <alexandr.lobakin@...el.com>,
        Magnus Karlsson <magnus.karlsson@...il.com>,
        Maryam Tahhan <mtahhan@...hat.com>, xdp-hints@...-project.net,
        Network Development <netdev@...r.kernel.org>
Subject: Re: [xdp-hints] Re: [PATCH bpf-next v3 11/12] mlx5: Support RX XDP metadata

On Thu, Dec 8, 2022 at 4:54 PM Toke Høiland-Jørgensen <toke@...hat.com> wrote:
>
> Alexei Starovoitov <alexei.starovoitov@...il.com> writes:
>
> > On Thu, Dec 8, 2022 at 4:29 PM Toke Høiland-Jørgensen <toke@...hat.com> wrote:
> >>
> >> Alexei Starovoitov <alexei.starovoitov@...il.com> writes:
> >>
> >> > On Thu, Dec 8, 2022 at 4:02 PM Toke Høiland-Jørgensen <toke@...hat.com> wrote:
> >> >>
> >> >> Stanislav Fomichev <sdf@...gle.com> writes:
> >> >>
> >> >> > On Thu, Dec 8, 2022 at 2:59 PM Toke Høiland-Jørgensen <toke@...hat.com> wrote:
> >> >> >>
> >> >> >> Stanislav Fomichev <sdf@...gle.com> writes:
> >> >> >>
> >> >> >> > From: Toke Høiland-Jørgensen <toke@...hat.com>
> >> >> >> >
> >> >> >> > Support RX hash and timestamp metadata kfuncs. We need to pass in the cqe
> >> >> >> > pointer to the mlx5e_skb_from* functions so it can be retrieved from the
> >> >> >> > XDP ctx to do this.
> >> >> >>
> >> >> >> So I finally managed to get enough ducks in a row to actually benchmark
> >> >> >> this. With the caveat that I suddenly can't get the timestamp support to
> >> >> >> work (it was working in an earlier version, but now
> >> >> >> timestamp_supported() just returns false). I'm not sure if this is an
> >> >> >> issue with the enablement patch, or if I just haven't gotten the
> >> >> >> hardware configured properly. I'll investigate some more, but figured
> >> >> >> I'd post these results now:
> >> >> >>
> >> >> >> Baseline XDP_DROP:         25,678,262 pps / 38.94 ns/pkt
> >> >> >> XDP_DROP + read metadata:  23,924,109 pps / 41.80 ns/pkt
> >> >> >> Overhead:                   1,754,153 pps /  2.86 ns/pkt
> >> >> >>
> >> >> >> As per the above, this is with calling three kfuncs/pkt
> >> >> >> (metadata_supported(), rx_hash_supported() and rx_hash()). So that's
> >> >> >> ~0.95 ns per function call, which is a bit less, but not far off from
> >> >> >> the ~1.2 ns that I'm used to. The tests where I accidentally called the
> >> >> >> default kfuncs cut off ~1.3 ns for one less kfunc call, so it's
> >> >> >> definitely in that ballpark.
> >> >> >>
> >> >> >> I'm not doing anything with the data, just reading it into an on-stack
> >> >> >> buffer, so this is the smallest possible delta from just getting the
> >> >> >> data out of the driver. I did confirm that the call instructions are
> >> >> >> still in the BPF program bytecode when it's dumped back out from the
> >> >> >> kernel.
> >> >> >>
> >> >> >> -Toke
> >> >> >>
> >> >> >
> >> >> > Oh, that's great, thanks for running the numbers! Will definitely
> >> >> > reference them in v4!
> >> >> > Presumably, we should be able to at least unroll most of the
> >> >> > _supported callbacks if we want, they should be relatively easy; but
> >> >> > the numbers look fine as is?
> >> >>
> >> >> Well, this is for one (and a half) piece of metadata. If we extrapolate
> >> >> it adds up quickly. Say we add csum and vlan tags, and maybe
> >> >> another callback to get the type of hash (l3/l4). Those would probably
> >> >> be relevant for most packets in a fairly common setup. Extrapolating
> >> >> from the ~1 ns/call figure, that's 8 ns/pkt, which is 20% of the
> >> >> baseline of 39 ns.
> >> >>
> >> >> So in that sense I still think unrolling makes sense. At least for the
> >> >> _supported() calls, as eating a whole function call just for that is
> >> >> probably a bit much (which I think was also Jakub's point in a sibling
> >> >> thread somewhere).
> >> >
> >> > imo the overhead is tiny enough that we can wait until
> >> > generic 'kfunc inlining' infra is ready.
> >> >
> >> > We're planning to dual-compile some_kernel_file.c
> >> > into native arch and into bpf arch.
> >> > Then the verifier will automatically inline bpf asm
> >> > of corresponding kfunc.
> >>
> >> Is that "planning" or "actively working on"? Just trying to get a sense
> >> of the time frames here, as this sounds neat, but also something that
> >> could potentially require quite a bit of fiddling with the build system
> >> to get to work? :)
> >
> > "planning", but regardless of how long it takes I'd rather not
> > add any more tech debt in the form of manual bpf asm generation.
> > We have too much of it already: gen_lookup, convert_ctx_access, etc.
>
> Right, I'm no fan of the manual ASM stuff either. However, if we're
> stuck with the function call overhead for the foreseeable future, maybe
> we should think about other ways of cutting down the number of function
> calls needed?
>
> One thing I can think of is to get rid of the individual _supported()
> kfuncs and instead have a single one that lets you query multiple
> features at once, like:
>
> __u64 features_supported, features_wanted = XDP_META_RX_HASH | XDP_META_TIMESTAMP;
>
> features_supported = bpf_xdp_metadata_query_features(ctx, features_wanted);
>
> if (features_supported & XDP_META_RX_HASH)
>   hash = bpf_xdp_metadata_rx_hash(ctx);
>
> ...etc

I'm not too happy about having the bitmasks tbh :-(
If we want to get rid of the cost of those _supported calls, maybe we
can do some kind of libbpf-like probing? That would require loading a
program + waiting for some packet though :-(

Or maybe they can just be cached for now?

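/* Query the driver only once; later packets reuse the cached answers. */
if (unlikely(!got_first_packet)) {
  have_hash = bpf_xdp_metadata_rx_hash_supported(ctx);
  have_timestamp = bpf_xdp_metadata_rx_timestamp_supported(ctx);
  got_first_packet = true;
}

if (have_hash) { /* read the hash via bpf_xdp_metadata_rx_hash(ctx) */ }
if (have_timestamp) { /* read the timestamp */ }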

That should hopefully work until the generic inlining infra is ready?
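
To make the saving concrete, here is a minimal userspace sketch of the caching idea, not an actual XDP program. The `fake_*` functions are hypothetical stand-ins for the `_supported()` kfuncs and just count how often they are called; `handle_packet()` and `run_packets()` are likewise made-up names for illustration. The `unlikely()` hint is left out because it is a kernel macro and this is plain C.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver kfuncs. Every call is counted so
 * the effect of caching is visible. A real XDP program would call the
 * bpf_xdp_metadata_*_supported() kfuncs here instead. */
static unsigned long supported_calls;

static bool fake_rx_hash_supported(void)
{
	supported_calls++;
	return true;		/* assume this driver provides a hash */
}

static bool fake_rx_timestamp_supported(void)
{
	supported_calls++;
	return false;		/* assume this driver has no timestamp */
}

static unsigned int fake_rx_hash(void)
{
	return 0xdeadbeef;	/* arbitrary placeholder value */
}

/* Cached capability flags, filled in on the first packet only. */
static bool got_first_packet, have_hash, have_timestamp;

/* Per-packet handler: returns the hash, or 0 when none is available. */
static unsigned int handle_packet(void)
{
	if (!got_first_packet) {
		have_hash = fake_rx_hash_supported();
		have_timestamp = fake_rx_timestamp_supported();
		got_first_packet = true;
	}

	return have_hash ? fake_rx_hash() : 0;
}

/* Push n packets through the handler and report how many _supported()
 * calls have been made in total so far. */
static unsigned long run_packets(unsigned long n)
{
	for (unsigned long i = 0; i < n; i++)
		handle_packet();

	return supported_calls;
}
```

Calling `run_packets(1000)` on a fresh process makes two `_supported()` calls in total instead of two thousand. One caveat with this sketch: the cached flags describe whichever device delivered the first packet, so it presumably only holds up when the program is attached to a single device (or to devices with identical capabilities).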

> -Toke
>
