Message-ID: <20200911095820.304d9877@carbon>
Date: Fri, 11 Sep 2020 09:58:20 +0200
From: Jesper Dangaard Brouer <brouer@...hat.com>
To: David Ahern <dsahern@...il.com>
Cc: Toke Høiland-Jørgensen <toke@...hat.com>,
Alexei Starovoitov <alexei.starovoitov@...il.com>,
Hangbin Liu <liuhangbin@...il.com>, bpf <bpf@...r.kernel.org>,
Network Development <netdev@...r.kernel.org>,
Jiri Benc <jbenc@...hat.com>,
Eelco Chaudron <echaudro@...hat.com>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Lorenzo Bianconi <lorenzo.bianconi@...hat.com>,
Andrii Nakryiko <andrii.nakryiko@...il.com>, brouer@...hat.com
Subject: Re: [PATCHv11 bpf-next 2/5] xdp: add a new helper for dev map
multicast support
On Thu, 10 Sep 2020 12:35:33 -0600
David Ahern <dsahern@...il.com> wrote:
> On 9/10/20 11:50 AM, Jesper Dangaard Brouer wrote:
> > Maybe we should change the devmap-prog approach, and run this on the
> > xdp_frames (in bq_xmit_all() to be precise). Hangbin's patchset
> > clearly shows that we need this "layer" between running the xdp_prog
> > and the devmap-prog.
>
> I would prefer to leave it in dev_map_enqueue.
>
> The main premise at the moment is that the program attached to the
> DEVMAP entry is an ACL specific to that dev. If the program is going to
> drop the packet, then no sense queueing it.
>
> I also expect a follow on feature will be useful to allow the DEVMAP
> program to do another REDIRECT (e.g., potentially after modifying). It
> is not handled at the moment as it needs thought - e.g., limiting the
> number of iterative redirects. If such a feature does happen, then no
> sense queueing it to the current device.
It makes a lot of sense to do queuing before redirecting again. The
(hidden) bulking we do at XDP redirect is the primary reason for the
performance boost. We all remember the performance difference between
the map and non-map versions of redirect, which Toke fixed by always
having the bulking available via net_device->xdp_bulkq.
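For reference, the bulk queue behind that looks roughly like the sketch
below (modelled on struct xdp_dev_bulk_queue in kernel/bpf/devmap.c;
field layout and the bulk size are from memory, so treat the details as
approximate):

/* Simplified sketch of the per-CPU bulk queue behind XDP redirect,
 * only meant to illustrate the batching, not the exact layout.
 */
#define DEV_MAP_BULK_SIZE 16		/* frames collected before a flush */

struct xdp_dev_bulk_queue {
	struct xdp_frame *q[DEV_MAP_BULK_SIZE];	/* pending frames */
	struct list_head flush_node;	/* linked into the per-CPU flush list */
	struct net_device *dev;		/* egress device */
	struct net_device *dev_rx;	/* ingress device */
	unsigned int count;		/* frames currently queued */
};

The flush in bq_xmit_all() then hands the whole array to ndo_xdp_xmit()
in one call, which is where the batching win comes from.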
In a simple micro-benchmark I bet it will look better to run the
devmap-prog right after the xdp_prog (which is what we have today). But
I claim this is the wrong approach as soon as (1) traffic is more
intermixed, and (2) the devmap-prog gets bigger and becomes more
specific to the egress device (e.g. BPF constants updated per egress
device). When this happens performance suffers, as the I-cache and the
data accessed per egress device get pushed out of cache. (Hint: this is
the problem the VPP/fd.io vector-processing approach solves.)
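To make the alternative concrete, below is a rough sketch (not the
actual patch) of running the devmap-prog once per bulk inside
bq_xmit_all(). The helper name run_devmap_prog_on_bulk() is made up for
illustration, and syncing header changes back into the xdp_frame is
elided:

/* Rough sketch only: run the devmap-attached prog over a bulk of
 * xdp_frames, so its instructions and per-egress-device state stay
 * hot in the caches. Verdict handling is simplified to pass/drop.
 */
static int run_devmap_prog_on_bulk(struct bpf_prog *xdp_prog,
				   struct xdp_frame **frames, int n)
{
	int i, nframes = 0;

	for (i = 0; i < n; i++) {
		struct xdp_buff xdp;
		u32 act;

		xdp_convert_frame_to_buff(frames[i], &xdp);
		act = bpf_prog_run_xdp(xdp_prog, &xdp);

		if (act == XDP_PASS)
			frames[nframes++] = frames[i];	/* keep for ndo_xdp_xmit() */
		else
			xdp_return_frame(frames[i]);	/* treat the rest as drops */
	}
	return nframes;	/* frames left to transmit */
}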
Queuing xdp_frames up for your devmap-prog makes sense, as these frames
share common properties. With intermixed traffic the first xdp_prog
will sort packets into egress devices, and then the devmap-prog can
operate on these batches. The best illustration[1] of this sorting I
have seen was in a Netflix blog post[2] about FreeBSD, section "RSS
Assisted LRO" (not directly related, but the illustration is good).
[1] https://miro.medium.com/max/700/1%2alTGL1_D6hTMEMa7EDV8yZA.png
[2] https://netflixtechblog.com/serving-100-gbps-from-an-open-connect-appliance-cdb51dda3b99
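For completeness, the "sorting" step above is just the normal RX-side
redirect into a DEVMAP. A minimal sketch (map name, size and key choice
are only examples, not taken from Hangbin's patchset):

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* Example only: DEVMAP holding egress ifindexes, filled from userspace. */
struct {
	__uint(type, BPF_MAP_TYPE_DEVMAP);
	__uint(max_entries, 64);
	__type(key, __u32);
	__type(value, __u32);
} tx_ports SEC(".maps");

SEC("xdp")
int xdp_sort_to_egress(struct xdp_md *ctx)
{
	__u32 key = ctx->ingress_ifindex % 64;	/* example key only */

	/* Frames redirected to the same egress device land in the same
	 * per-CPU bulk queue, where a devmap-prog could then run on the
	 * whole batch.
	 */
	return bpf_redirect_map(&tx_ports, key, 0);
}

char _license[] SEC("license") = "GPL";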
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer