netdev - Re: [PATCH bpf] xsk: improve documentation for AF

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <CAJ8uoz0CWRavNXTrEhbS3tv1mkQD+VF7t=b-0VppO4c6Am_x8Q@mail.gmail.com>
Date:   Mon, 14 Oct 2019 10:03:29 +0200
From:   Magnus Karlsson <magnus.karlsson@...il.com>
To:     Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc:     Magnus Karlsson <magnus.karlsson@...el.com>,
        Björn Töpel <bjorn.topel@...el.com>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Network Development <netdev@...r.kernel.org>,
        Jonathan Lemon <jonathan.lemon@...il.com>,
        "open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
        bpf <bpf@...r.kernel.org>
Subject: Re: [PATCH bpf] xsk: improve documentation for AF_XDP

On Sat, Oct 12, 2019 at 6:52 PM Alexei Starovoitov
<alexei.starovoitov@...il.com> wrote:
>
> On Thu, Oct 10, 2019 at 1:12 AM Magnus Karlsson
> <magnus.karlsson@...el.com> wrote:
> >
> > Added sections on all the bind flags, libbpf, all the setsockopts and
> > all the getsockopts. Also updated the document to reflect the latest
> > features and to correct some spelling errors.
> >
> > Signed-off-by: Magnus Karlsson <magnus.karlsson@...el.com>
>
> thanks for the update. Overall looks good.
> Few nits below:

Will fix all your comments and send out a v2.

Thanks: Magnus

> > +What socket will then a packet arrive on? This is decided by the XDP
> > +program. Put all the sockets in the XSK_MAP and just indicate which
> > +index in the array you would like to send each packet to. A simple
> > +round-robin example of distributing packets is shown below:
> > +
> > +.. code-block:: c
> > +
> > +   #define KBUILD_MODNAME "af_xdp_example"
>
> what is this for?
> It's not a kernel module.
>
> > +   #include <uapi/linux/bpf.h>
>
> why 'uapi' ? It should use only user space headers.
>
> > +   #include "bpf_helpers.h"
> > +
> > +   #define MAX_SOCKS 16
> > +
> > +   struct bpf_map_def SEC("maps") xsks_map = {
> > +         .type = BPF_MAP_TYPE_XSKMAP,
> > +         .key_size = sizeof(int),
> > +         .value_size = sizeof(int),
> > +         .max_entries = MAX_SOCKS,
> > +   };
>
> Could you switch to BTF defined maps?
> libbpf will forever support old style as well,
> but documentation should point to the latest.
>
> > +
> > +   struct bpf_map_def SEC("maps") rr_map = {
> > +         .type = BPF_MAP_TYPE_PERCPU_ARRAY,
> > +         .key_size = sizeof(int),
> > +         .value_size = sizeof(unsigned int),
> > +         .max_entries = 1,
> > +   };
> > +
> > +   SEC("xdp_sock") int xdp_sock_prog(struct xdp_md *ctx)
> > +   {
> > +       int key = 0, idx;
> > +       unsigned int *rr;
> > +
> > +       rr = bpf_map_lookup_elem(&rr_map, &key);
> > +       if (!rr)
> > +          return XDP_ABORTED;
> > +
> > +       *rr = (*rr + 1) & (MAX_SOCKS - 1);
> > +       idx = *rr;
> > +
> > +       return bpf_redirect_map(&xsks_map, idx, 0);
> > +   }
> > +
> > +   char _license[] SEC("license") = "GPL";
>
> Above sample doesn't use gpl-only helpers. Why add above line?
>
> > +.. code-block:: c
> > +
> > +   if (xsk_ring_prod__needs_wakeup(&my_tx_ring))
> > +      sendto(xsk_socket__fd(xsk_handle), NULL, 0, MSG_DONTWAIT, NULL, 0);
> > +
> > +I.e., only use the syscall if the flag is set.
> > +
> > +We recommend that you always enable this mode as it can lead to
> > +magnitudes better performance if you run the application and the
> > +driver on the same core and somewhat better performance even if you
> > +use different cores for the application and the kernel driver, as it
> > +reduces the number of syscalls needed for the TX path.
>
> "magnitudes better performance"? Is it really at least 20 times better?
>
> > -Naive ring dequeue and enqueue could look like this::
> > +Naive ring dequeue and enqueue could look like this:
>
> lol. That's a good typo.