netdev - Re: [net-next V6 PATCH 0/5] New bpf cpumap type for XDP

Message-ID: <a430c181-aa56-61a7-fc59-9b135bbb262b@gmail.com>
Date:   Tue, 10 Oct 2017 23:10:39 -0700
From:   John Fastabend <john.fastabend@...il.com>
To:     Jesper Dangaard Brouer <brouer@...hat.com>, netdev@...r.kernel.org
Cc:     jakub.kicinski@...ronome.com,
        "Michael S. Tsirkin" <mst@...hat.com>, pavel.odintsov@...il.com,
        Jason Wang <jasowang@...hat.com>, mchan@...adcom.com,
        peter.waskiewicz.jr@...el.com,
        Daniel Borkmann <borkmann@...earbox.net>,
        Alexei Starovoitov <alexei.starovoitov@...il.com>,
        Andy Gospodarek <andy@...yhouse.net>
Subject: Re: [net-next V6 PATCH 0/5] New bpf cpumap type for XDP_REDIRECT

On 10/10/2017 05:47 AM, Jesper Dangaard Brouer wrote:
> Introducing a new way to redirect XDP frames.  Notice how no driver
> changes are necessary given the design of XDP_REDIRECT.
> 
> This redirect map type is called 'cpumap', as it allows redirecting
> XDP frames to remote CPUs.  The remote CPU will do the SKB allocation
> and start the network stack invocation on that CPU.
> 
> This is a scalability and isolation mechanism that allows separating
> the early driver network XDP layer from the rest of the netstack, and
> assigning dedicated CPUs for this stage.  The sysadmin controls/configures
> the RX-CPU to NIC-RX queue mapping (as usual) via procfs smp_affinity,
> and how many queues are configured via ethtool --set-channels.
> Benchmarks show that a single CPU can handle approx 11Mpps.  Thus,
> assigning only two NIC RX-queues (and two CPUs) is sufficient for
> handling 10Gbit/s wirespeed with smallest packets, 14.88Mpps.  Reducing
> the number of queues has the advantage that more packets are available
> for "bulk" processing per hard interrupt[1].
> 
> [1] https://www.netdevconf.org/2.1/papers/BusyPollingNextGen.pdf
> 
> Use-cases:
> 
> 1. End-host based pre-filtering for DDoS mitigation.  This is fast
>    enough to allow software to see and filter all packets at wirespeed.
>    Thus, no packets get silently dropped by hardware.
> 
> 2. When NIC HW unevenly distributes packets across RX queues, this
>    mechanism can be used for redistributing load across CPUs.  This
>    usually happens when HW is unaware of a new protocol.  This
>    resembles RPS (Receive Packet Steering), just faster, but with more
>    responsibility placed on the BPF program for correct steering.
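
For reference, the 14.88Mpps figure quoted above follows from standard
Ethernet framing overhead, and the "two CPUs" conclusion follows from the
quoted ~11Mpps per-CPU benchmark. A minimal sketch of that arithmetic; the
function and constant names are illustrative only and are not part of the
patchset:

```python
import math

# Standard Ethernet per-frame wire overhead, in bytes.
PREAMBLE_SFD = 8      # 7-byte preamble + 1-byte start-of-frame delimiter
INTERFRAME_GAP = 12   # minimum idle gap between frames
MIN_FRAME = 64        # smallest Ethernet frame, including the FCS


def wirespeed_pps(link_bps, frame_bytes=MIN_FRAME):
    """Packets per second a link can carry at the given frame size."""
    bits_on_wire = (frame_bytes + PREAMBLE_SFD + INTERFRAME_GAP) * 8
    return link_bps / bits_on_wire


# 10Gbit/s link with smallest-size frames.
pps = wirespeed_pps(10e9)

# Per-CPU rate taken from the benchmark figure quoted in the cover letter.
PER_CPU_PPS = 11e6
cpus_needed = math.ceil(pps / PER_CPU_PPS)

print(f"{pps / 1e6:.2f} Mpps, {cpus_needed} RX CPUs")
```

Each 64-byte frame occupies 84 bytes (672 bits) on the wire, so a 10Gbit/s
link carries about 14.88 million of them per second, which one ~11Mpps CPU
cannot absorb but two can.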

Hi Jesper,

Another (somewhat meta) comment about the performance benchmarks. In
one of the original threads you showed that the XDP cpumap outperformed
RPS in TCP_CRR netperf tests. The difference was significant, IIRC in
the Mpps range.

But, with this series we will skip GRO. Do you have any idea how this
looks with other tests such as TCP_STREAM? I'm trying to understand
if this is something that can be used in the general case or is more
for the special case and will have to be enabled/disabled by the
orchestration layer depending on workload/network conditions.

My intuition is that the general case will be slower due to the lack of
GRO. If this is the case, any ideas on how we could add GRO? It is not
needed in the initial patchset, but I am trying to see if the two are
mutually exclusive. I don't off-hand see an easy way to pull GRO into
this feature.

Thanks,
John